The Resharding Problem
A naive way to spread keys across shards is to take the hash of a key modulo the node count. The trouble is that changing the node count changes nearly every assignment, forcing a massive data move. Consistent hashing fixes this.
The Ring
Imagine hash values arranged on a circular ring from zero to a large maximum. Each node is hashed to one or more points on the ring. To place a key, hash it and walk clockwise until you hit the first node. That node owns the key.
Why It Helps
- Adding a node only steals keys from its clockwise neighbor, so on average only a fraction of keys move.
- Removing a node hands its keys to the next node, again touching a small slice.
Virtual Nodes
A single point per node gives uneven load. Each physical node is placed at many points called virtual nodes, smoothing the distribution and making rebalancing fairer.
Key idea
Consistent hashing places keys and nodes on a ring so adding or removing a node only reassigns a small fraction of keys instead of nearly all of them.