Today, I’ll talk about how the consistent hashing algorithm works; in my next blog entry, I’ll follow up with how it is used in OpenStack Swift rings. To start off, let’s talk about the basic hash function.
1. Basic Hash Function
A basic hash function maps objects to different drives based on the hash of each object, so they can be fetched later. It is probably best to think of basic hashing as a typical encyclopedia: no matter how many times I look up information on computers, I will always check the CAL – EDU volume (as long as I am looking through the same edition). Think of an encyclopedia volume as a drive, and the volume number as a mapping value.
Let’s say I have a series of objects I want to store, and I have 4 drives (or partitions) in my storage server, labeled Drive 0, 1, 2, and 3. A basic hash function takes the object data and hashes it with the MD5 algorithm to produce a short, fixed-length digest: a 32-character hexadecimal value, which can be converted into a decimal value. Finally, it divides that value by the number of drives; in our example, that is 4, because I have 4 drives. The object is then stored based on the remainder of that division, which will be a value from 0 to 3 – let’s call this the mapping value.
Here are the example objects I want to store and their hash values:
Table 1.1. Mapping of Objects to Different Drives Using the Basic Hash Function

| Object  | Hash Value (Hexadecimal)         | Mapping Value         | Drive Mapped To |
|---------|----------------------------------|-----------------------|-----------------|
| Image 1 | b5e7d988cfdb78bc3be1a9c221a8f744 | hash(Image 1) % 4 = 0 | Drive 0 |
| Image 2 | 943359f44dc87f6a16973c79827a038c | hash(Image 2) % 4 = 0 | Drive 0 |
| Image 3 | 1213f717f7f754f050d0246fb7d6c43b | hash(Image 3) % 4 = 3 | Drive 3 |
| Music 1 | 4b46f1381a53605fc0f93a93d55bf8be | hash(Music 1) % 4 = 2 | Drive 2 |
| Music 2 | ecb27b466c32a56730298e55bcace257 | hash(Music 2) % 4 = 3 | Drive 3 |
| Music 3 | 508259dfec6b1544f4ad6e4d52964f59 | hash(Music 3) % 4 = 1 | Drive 1 |
| Movie 1 | 69db47ace5f026310ab170b02ac8bc58 | hash(Movie 1) % 4 = 0 | Drive 0 |
| Movie 2 | c4abbd49974ba44c169c220dadbdac71 | hash(Movie 2) % 4 = 1 | Drive 1 |
But what if we have to add or remove drives? The hash values of all objects stay the same, but the divisor changes, so we have to re-compute the mapping value for every object and re-map almost all of them to different drives.
That’s too much work for our servers.
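To make that cost concrete, here is a minimal sketch of the basic hash mapping in Python. The object names are hypothetical stand-ins for real object data; the point is to count how many of 1,000 objects would have to move if a fifth drive were added.

```python
import hashlib

def drive_for(data: bytes, num_drives: int) -> int:
    """Basic hash mapping: interpret the MD5 digest as an integer, mod drives."""
    digest = hashlib.md5(data).hexdigest()  # 32 hexadecimal characters
    return int(digest, 16) % num_drives

# Hypothetical object names standing in for real object data.
objects = [f"object-{i}".encode() for i in range(1000)]

# Adding a 5th drive changes the divisor, so almost every mapping changes.
moved = sum(drive_for(o, 4) != drive_for(o, 5) for o in objects)
print(f"{moved} of 1000 objects would have to move")  # roughly 4 out of 5
```

An object keeps its drive only when its hash gives the same remainder mod 4 and mod 5, which happens for about 1 in 5 objects; the other ~80% all have to move.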
2. Consistent Hashing Algorithm
The consistent hashing algorithm achieves a similar goal but does things differently. It still hashes the object data, but instead of computing a mapping value for each object, it assigns each drive a range of hash values to store. Again, think of this as an encyclopedia: each volume is a drive, and the range of first letters each volume covers is like the range of hash values whose objects are mapped to that drive.
Table 2.1. Range of Hash Values for Each Drive

| Drive   | Range of Hash Values |
|---------|----------------------|
| Drive 0 | 0000… ~ 3fff… |
| Drive 1 | 4000… ~ 7fff… |
| Drive 2 | 8000… ~ bfff… |
| Drive 3 | c000… ~ ffff… |
Note: This is just an example; real hash values are much longer.
Table 2.2. Mapping of Objects to Different Drives Using the Consistent Hashing Algorithm

| Object  | Hash Value (Hexadecimal)         | Drive Mapped To |
|---------|----------------------------------|-----------------|
| Image 1 | b5e7d988cfdb78bc3be1a9c221a8f744 | Drive 2 |
| Image 2 | 943359f44dc87f6a16973c79827a038c | Drive 2 |
| Image 3 | 1213f717f7f754f050d0246fb7d6c43b | Drive 0 |
| Music 1 | 4b46f1381a53605fc0f93a93d55bf8be | Drive 1 |
| Music 2 | ecb27b466c32a56730298e55bcace257 | Drive 3 |
| Music 3 | 508259dfec6b1544f4ad6e4d52964f59 | Drive 1 |
| Movie 1 | 69db47ace5f026310ab170b02ac8bc58 | Drive 1 |
| Movie 2 | c4abbd49974ba44c169c220dadbdac71 | Drive 3 |
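Under the assumptions of Table 2.1, the lookup becomes a search over range boundaries rather than a division. In this sketch I use only the first 4 hex digits of the MD5 digest as the position on the ring, which is enough for the illustration:

```python
import bisect
import hashlib

# Exclusive upper bound of each drive's hash range, per Table 2.1
# (Drive 0 holds 0000-3fff, Drive 1 holds 4000-7fff, and so on).
BOUNDARIES = [0x4000, 0x8000, 0xC000, 0x10000]

def drive_for_position(position: int) -> int:
    """Find the drive whose range contains this 16-bit position."""
    return bisect.bisect_right(BOUNDARIES, position)

def drive_for(data: bytes) -> int:
    # Use the first 4 hex digits of the MD5 digest as the ring position.
    position = int(hashlib.md5(data).hexdigest()[:4], 16)
    return drive_for_position(position)
```

For example, Image 1’s digest starts with b5e7, which falls in Drive 2’s range (8000…–bfff…), matching Table 2.2. Adding a drive now only means adjusting boundaries, not recomputing every object’s hash.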
Now if I add more drives, the only thing that changes is that each drive gets a new range of hash values to store. Each object’s hash value remains the same. Any object whose hash value is still within its current drive’s range stays put; any object whose hash value falls outside that range is mapped to another drive. But with the consistent hashing algorithm, that number of objects is very small compared to the basic hash function.
I’ll add another drive and re-illustrate my point in the picture below:
Notice how only the Movie 2 and Music 2 objects were mapped to my new drive (drive 4), and Image 1 had to be mapped to drive 3. If we used the basic hash function, we would most likely have to re-calculate the mapping values for all objects and re-map them accordingly. Imagine how much work that is for thousands, or even millions, of objects.
But there’s more to it once the algorithm is modified a bit.
3. Multiple Markers in Consistent Hashing Algorithm
First, let’s look at what the multiple markers do for us.
Remember that in the consistent hashing algorithm, each drive has one big range of hash values for mapping objects. Multiple markers help distribute objects more evenly across the drives, which helps with load balancing. But how?
Instead of giving each drive one big hash range, multiple markers split those large ranges into smaller chunks, and those smaller ranges are assigned to different drives in the server. How does that help?
Let’s say I have 20 objects I want to store, and I still have 4 drives, each assigned a hash range of equal length. But what if, out of those 20 objects, 14 map to drive 0 and the rest are distributed equally across drives 1, 2, and 3? This makes the ring unbalanced in weight, because drive 0 holds far more objects than the rest of the drives. This is where smaller hash ranges can help a lot with load balancing.
As mentioned earlier, the consistent hashing algorithm uses multiple markers so that each drive maps several smaller ranges of hash values instead of one big range. This has two positive effects. First, if a new drive is added, it takes a few objects from every existing drive in the server, instead of a large chunk from a single neighboring drive (resulting in more, smaller hash ranges). Likewise, if an existing drive is removed, the objects it held are distributed evenly across the remaining drives (resulting in fewer, larger hash ranges). Second, because each drive’s load is spread across many small ranges, the overall distribution of objects stays fairly even, meaning the weight among drives is close to uniform – which helps with load balancing.
The picture above shows how several objects that are close to each other in hash value end up distributed among segments of different drives. Multiple markers split the 4 big hash ranges into several smaller ranges and assign them across all of the drives.
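One way to sketch multiple markers (often called virtual nodes) is to hash several marker names per drive onto the ring, then map each object to the first marker at or after its position. Everything here is hypothetical – the drive names, the choice of 16 markers per drive, and the "drive-marker-i" naming – and OpenStack Swift’s actual ring works differently, as I’ll cover in the next entry:

```python
import bisect
import hashlib

def _position(key: str) -> int:
    """Place a key on the ring using its MD5 digest as an integer."""
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

class Ring:
    """Consistent hashing with multiple markers (virtual nodes) per drive."""

    def __init__(self, drives, markers_per_drive=16):
        points = []
        for drive in drives:
            for i in range(markers_per_drive):
                # Each marker gets its own spot by hashing a hypothetical
                # "drive-marker-i" name, so each drive ends up owning many
                # small ranges scattered around the ring.
                points.append((_position(f"{drive}-marker-{i}"), drive))
        points.sort()
        self._positions = [pos for pos, _ in points]
        self._drives = [drive for _, drive in points]

    def drive_for(self, obj_name: str) -> str:
        """Map an object to the first marker at or after its position."""
        idx = bisect.bisect(self._positions, _position(obj_name))
        return self._drives[idx % len(self._drives)]

ring4 = Ring(["drive0", "drive1", "drive2", "drive3"])
ring5 = Ring(["drive0", "drive1", "drive2", "drive3", "drive4"])
objs = [f"object-{i}" for i in range(1000)]
moved = sum(ring4.drive_for(o) != ring5.drive_for(o) for o in objs)
print(f"{moved} of 1000 objects moved")  # roughly a fifth, all onto drive4
```

Note the two effects from above: adding drive 4 steals a few objects from every existing drive (every moved object lands on the new drive), and each drive’s many small ranges keep the overall weight close to even.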
On my next entry, I will talk about how Swift utilizes this algorithm and how it takes a different approach.