When Replicas Are Not Enough
Read replicas scale reads, but every replica still applies every write, so they do nothing for write throughput. When a single primary cannot keep up with writes, you partition the data across multiple independent databases. This is sharding, also called horizontal partitioning.
What a Shard Is
- A shard is one database holding a subset of the rows.
- Each shard has its own primary and can have its own replicas.
- A shard key decides which shard a given row lives on.
How Routing Works
A routing layer hashes or maps the shard key to pick the target shard. For a user table sharded by user id, all rows for a given user land on one shard. Writes for different users now hit different machines, so aggregate write capacity grows roughly linearly with shard count.
The Costs
- Cross shard queries become expensive because data is spread out.
- Transactions across shards are hard and often avoided.
- Rebalancing when adding shards moves data and is operationally risky.
Key idea
Sharding scales writes by splitting rows across independent databases keyed by a shard key, at the cost of harder cross shard queries and transactions.