Sharding Strategies: Range and Hash

Why Shard

Sharding splits a dataset across multiple nodes so each node holds only a slice. This scales write throughput and storage capacity, unlike replication, where every replica holds a full copy of the data. The key question is how to decide which row goes to which shard.

Range Sharding

Range sharding assigns contiguous key ranges to shards. For example, keys beginning with A through M go to shard one, and keys beginning with N through Z go to shard two.

Pro: Range scans are efficient because nearby keys sit together on the same shard.
Con: Uneven distribution. If the key is a timestamp, all new writes carry recent values and pile onto one shard, creating a hot spot.
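
The A to M / N to Z split above can be sketched in a few lines of Python. This is a minimal illustration, not a production router: the names SPLIT_POINTS, range_shard, and shards_for_scan are hypothetical, shards are numbered from 0, and the single split point "N" is an assumption taken from the example.

```python
from bisect import bisect_right

# Hypothetical split points: keys sorting below "N" go to shard 0,
# keys sorting at or above "N" go to shard 1.
SPLIT_POINTS = ["N"]


def range_shard(key: str) -> int:
    """Return the index of the shard whose key range contains `key`."""
    return bisect_right(SPLIT_POINTS, key.upper())


def shards_for_scan(low: str, high: str) -> range:
    """Return the contiguous run of shards that a scan over [low, high] must visit."""
    return range(range_shard(low), range_shard(high) + 1)
```

A scan from "A" to "C" touches only shard 0, which is the efficiency the Pro line describes. The same lookup also shows the Con: if keys only ever grow (as timestamps do), every new key sorts past the last split point and lands on the final shard.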

Hash Sharding

Hash sharding applies a hash function to the key and uses the result to pick a shard, commonly by taking the hash modulo the number of shards.

Pro: Even spread. A good hash function scatters keys close to uniformly, smoothing load across shards.
Con: Range scans become expensive because adjacent keys land on different shards, so a scan must hit every shard.
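
A minimal sketch of hash-based placement, assuming a fixed shard count and modulo placement (both are assumptions for illustration, as is the function name hash_shard). It uses SHA-256 from the standard library rather than Python's built-in hash(), because the built-in is salted per process for strings and would not route the same key to the same shard across restarts.

```python
import hashlib


def hash_shard(key: str, num_shards: int) -> int:
    """Map `key` to a shard index in [0, num_shards) using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Interpret the first 8 bytes as an unsigned integer, then reduce
    # it modulo the shard count (an assumed placement rule).
    return int.from_bytes(digest[:8], "big") % num_shards
```

Calling hash_shard on consecutive keys such as "user1", "user2", "user3" will generally return unrelated shard indexes. That is what spreads the load, and it is also why a range scan over those keys has to query every shard.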

Choosing

Pick range sharding when ordered scans dominate, such as time-series queries over a window. Pick hash sharding when point lookups dominate and hot spots are a concern. Many systems combine the two: they hash a high-level key to pick the shard, then keep rows ordered by a second key within it.
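
The combined approach can be sketched as an in-memory toy. Everything here is hypothetical: the class name CompositeStore, the choice of a tenant ID as the high-level key, and a timestamp as the ordered key. Each shard is modeled as a sorted Python list, which stands in for whatever ordered storage a real system would use.

```python
import hashlib
from bisect import bisect_left, insort


class CompositeStore:
    """Toy store: hash the tenant to pick a shard, keep rows sorted within it."""

    def __init__(self, num_shards: int) -> None:
        # Each shard is a list of (tenant, timestamp, value) tuples kept in sorted order.
        self.shards = [[] for _ in range(num_shards)]

    def _shard_for(self, tenant: str) -> int:
        digest = hashlib.sha256(tenant.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self.shards)

    def put(self, tenant: str, timestamp: int, value: str) -> None:
        insort(self.shards[self._shard_for(tenant)], (tenant, timestamp, value))

    def scan(self, tenant: str, start: int, end: int) -> list:
        """Return the tenant's rows with start <= timestamp < end, in order."""
        rows = self.shards[self._shard_for(tenant)]
        # A 2-tuple sorts before any 3-tuple sharing its prefix, so these
        # bounds bracket exactly the rows inside the half-open window.
        lo = bisect_left(rows, (tenant, start))
        hi = bisect_left(rows, (tenant, end))
        return rows[lo:hi]
```

With this layout, a window query for one tenant visits a single shard and reads a contiguous slice, while different tenants are spread across shards by the hash. A single very large or very busy tenant would still concentrate on one shard, so the choice of high-level key matters.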

Key idea