← Lessons

quiz vs the machine

Gold1420

Databases

Write Scaling with Sharding

When one primary cannot absorb the write load, sharding splits data across independent databases.

5 min read · core · beat Gold to climb

When Replicas Are Not Enough

Read replicas scale reads, but every replica still applies every write, so they do nothing for write throughput. When a single primary cannot keep up with writes, you partition the data across multiple independent databases. This is sharding, also called horizontal partitioning.

What a Shard Is

  • A shard is one database holding a subset of the rows.
  • Each shard has its own primary and can have its own replicas.
  • A shard key decides which shard a given row lives on.

How Routing Works

A routing layer hashes or maps the shard key to pick the target shard. For a user table sharded by user id, all rows for a given user land on one shard. Writes for different users now hit different machines, so aggregate write capacity grows roughly linearly with shard count.

The Costs

  • Cross shard queries become expensive because data is spread out.
  • Transactions across shards are hard and often avoided.
  • Rebalancing when adding shards moves data and is operationally risky.

Key idea

Sharding scales writes by splitting rows across independent databases keyed by a shard key, at the cost of harder cross shard queries and transactions.

Check yourself

Answer to earn rating on the learn ladder.

1. Why do read replicas fail to scale writes?

2. What does the shard key determine?

3. Which operation becomes notably harder under sharding?