Sharding Strategies: Range and Hash

Splitting data across shards by range or by hash.

5 min read · intro

Why Shard

Sharding splits a dataset across multiple nodes so that each node holds only a slice of it. This scales write throughput and storage, unlike replication, where every replica holds a full copy of the data. The key question is how to decide which row goes to which shard.

Range Sharding

Range sharding assigns contiguous key ranges to shards. For example, keys starting with A through M go to shard one, and keys starting with N through Z go to shard two.

  • Pro: Range scans are efficient because nearby keys sit together on the same shard.
  • Con: Uneven distribution. If the key is a timestamp, all new writes land on the shard that holds the most recent range, creating a hot spot.
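The lookup and the hot spot described above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular database implements it: the boundary lists, the helper name `range_shard`, and the timestamp values are all assumptions made for the example, and name keys are assumed to start with an uppercase letter.

```python
from bisect import bisect_right
from collections import Counter

def range_shard(key, boundaries):
    """Return the index of the shard whose range contains key.

    boundaries is a sorted list (hypothetical layout) where each entry
    is the first key of the next shard.
    """
    return bisect_right(boundaries, key)

# Shard 0 holds keys below "N" (A to M); shard 1 holds "N" and above.
NAME_BOUNDARIES = ["N"]
print(range_shard("Alice", NAME_BOUNDARIES))  # 0
print(range_shard("Nina", NAME_BOUNDARIES))   # 1

# Hot spot: with timestamp keys, every new write falls past the last
# boundary, so a single shard absorbs all of the new traffic.
TS_BOUNDARIES = [1_000, 2_000]
new_writes = range(2_000, 2_100)
load = Counter(range_shard(ts, TS_BOUNDARIES) for ts in new_writes)
print(load)  # Counter({2: 100})
```

Because neighboring keys map to the same or adjacent shard indexes, a scan over a key range only needs the shards between `range_shard(start, ...)` and `range_shard(end, ...)`.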

Hash Sharding

Hash sharding applies a hash function to the key and uses the result to pick a shard.

  • Pro: Even spread. A good hash function scatters keys roughly uniformly, smoothing load across shards.
  • Con: Range scans become expensive. Adjacent keys land on different shards, so a range scan must query every shard and merge the results.
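A small Python sketch of the same idea follows. It assumes four shards and uses SHA-256 reduced modulo the shard count; both choices, along with the helper name `hash_shard` and the sample keys, are illustrative assumptions rather than a recommendation. A stable hash is used because Python's built-in `hash()` for strings varies between processes.

```python
import hashlib
from collections import Counter

NUM_SHARDS = 4  # hypothetical cluster size

def hash_shard(key: str, num_shards: int = NUM_SHARDS) -> int:
    """Pick a shard from a stable hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Use the first 8 bytes as an integer, then reduce to a shard index.
    return int.from_bytes(digest[:8], "big") % num_shards

# Even spread: sequential keys end up scattered across all shards.
keys = [f"user{i:04d}" for i in range(1000)]
load = Counter(hash_shard(k) for k in keys)
print(sorted(load.items()))

# Scan cost: 20 adjacent keys no longer live on one shard,
# so a range scan has to ask several (in general, all) shards.
touched = {hash_shard(k) for k in keys[:20]}
print(f"adjacent keys touch {len(touched)} of {NUM_SHARDS} shards")
```

Note that plain modulo placement moves most keys when `NUM_SHARDS` changes; that trade-off is outside the scope of this lesson.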

Choosing

Pick range sharding when ordered scans dominate, such as time-series queries over a window. Pick hash sharding when point lookups dominate and hot spots are a concern. Many systems combine the two: they hash a high-level key to choose a shard, then keep rows in key order within that shard.
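One way to picture the combined approach is a composite key. In the sketch below, a device ID is hashed to choose the shard, and each shard keeps its rows sorted by (device ID, timestamp), so a time-window scan for one device touches a single shard. The in-memory lists, the names `write` and `scan`, and the sensor data are all assumptions made for illustration.

```python
import hashlib
from bisect import bisect_left, bisect_right, insort

NUM_SHARDS = 4  # hypothetical cluster size

def hash_shard(key: str) -> int:
    """Pick a shard from a stable hash of the high-level key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS

# Each shard keeps its rows sorted by (device_id, timestamp, value).
shards = [[] for _ in range(NUM_SHARDS)]

def write(device_id: str, ts: int, value: float) -> None:
    """Hash the device ID to pick the shard, then insert in sorted order."""
    insort(shards[hash_shard(device_id)], (device_id, ts, value))

def scan(device_id: str, start_ts: int, end_ts: int) -> list:
    """Return rows for one device with start_ts <= ts <= end_ts."""
    rows = shards[hash_shard(device_id)]  # only one shard is consulted
    lo = bisect_left(rows, (device_id, start_ts))
    hi = bisect_right(rows, (device_id, end_ts, float("inf")))
    return rows[lo:hi]

for t in range(10):
    write("sensor-a", t, float(t))
    write("sensor-b", t, float(t) * 2)

window = scan("sensor-a", 3, 5)
print(window)
# [('sensor-a', 3, 3.0), ('sensor-a', 4, 4.0), ('sensor-a', 5, 5.0)]
```

Writes from different devices spread across shards by hash, while each device's own history stays contiguous. A single very busy device would still concentrate its load on one shard.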

Key idea

Range sharding keeps neighbors together for scans but risks hot spots; hash sharding spreads load evenly but scatters ranges across shards.

Check yourself

1. What is the main weakness of range sharding?

2. Why are range scans costly under hash sharding?