Single Shard vs Scatter Gather
The happy path in a sharded system is the single shard query: the filter includes the shard key, so the router sends the request to exactly one node. The query is fast, and its load stays isolated to that node.
Many real queries cannot be pinned to one shard. A query without the shard key, or one that aggregates across users, becomes a scatter gather: the coordinator broadcasts to all shards, collects partial results, and merges them.
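The two paths can be sketched in a few lines. This is a minimal illustration, not any particular database's router: the in-memory `shards` list, the CRC32 based `shard_for` function, and the `query` signature are all assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor
from zlib import crc32

# Hypothetical setup: four in-memory "shards", each a list of row dicts.
NUM_SHARDS = 4
shards = [[] for _ in range(NUM_SHARDS)]

def shard_for(user_id: str) -> int:
    """Map a shard key to a shard index with a stable hash (assumed scheme)."""
    return crc32(user_id.encode("utf-8")) % NUM_SHARDS

def insert(row: dict) -> None:
    """Store a row on the shard that owns its user_id."""
    shards[shard_for(row["user_id"])].append(row)

def query_shard(index: int, predicate) -> list:
    """Stand-in for a network call to one shard."""
    return [row for row in shards[index] if predicate(row)]

def query(predicate, user_id=None):
    """Return (rows, shards_touched)."""
    if user_id is not None:
        # Single shard: the shard key pins the query to exactly one node.
        index = shard_for(user_id)
        rows = query_shard(
            index, lambda r: r["user_id"] == user_id and predicate(r)
        )
        return rows, 1
    # Scatter gather: broadcast to every shard, then merge the partial results.
    with ThreadPoolExecutor(max_workers=NUM_SHARDS) as pool:
        partials = list(
            pool.map(lambda i: query_shard(i, predicate), range(NUM_SHARDS))
        )
    merged = [row for part in partials for row in part]
    return merged, NUM_SHARDS

# Example data for the sketch.
for user, amount in [("alice", 5), ("alice", 20), ("bob", 15), ("carol", 7)]:
    insert({"user_id": user, "amount": amount})

alice_rows, alice_touched = query(lambda r: True, user_id="alice")
big_rows, big_touched = query(lambda r: r["amount"] > 10)
```

The first call touches one shard because it carries the shard key; the second has no shard key, so it must ask all four shards even though only two rows match.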
The Costs
- Latency is set by the slowest shard. The coordinator cannot respond until every shard has answered, so with many shards, tail latency on any one of them delays the whole query.
- Fan out load multiplies. One client query becomes N shard queries, so a few scatter queries can saturate the cluster.
- Merging aggregations, sorts, and limits across partial results needs careful logic in the coordinator.
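Two of these costs are easy to make concrete. The sketch below assumes shards are slow independently of one another (real clusters often are not), and the function names are hypothetical. It shows how quickly the chance of hitting a slow shard grows with fan out, and what coordinator-side merging looks like for a sorted query with a limit and for an average.

```python
import heapq
from itertools import islice

def p_any_slow(num_shards: int, p_slow: float) -> float:
    """Chance that at least one shard is slow, assuming independence."""
    return 1.0 - (1.0 - p_slow) ** num_shards

def merge_top_k(partials, k):
    """Merge per-shard lists that are already sorted ascending; keep k rows.

    Each shard must return its own first k rows for the result to be correct.
    """
    return list(islice(heapq.merge(*partials), k))

def merge_average(partials):
    """Combine per-shard (sum, count) pairs into one global average.

    Averaging the per-shard averages would be wrong when shards hold
    different numbers of rows, so shards return sums and counts instead.
    """
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total / count if count else None

# If each shard exceeds the latency target 1% of the time, a query that
# fans out to 100 shards is slow far more often than 1% of the time.
slow_fraction = p_any_slow(100, 0.01)

top4 = merge_top_k([[1, 4, 9], [2, 3, 10], [5]], 4)
avg = merge_average([(10, 2), (90, 8)])
```

Under the independence assumption, `slow_fraction` comes out at roughly 0.63: a 1-in-100 event per shard becomes the common case for the query as a whole.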
Mitigations
- Denormalize so common queries carry the shard key and stay single shard.
- Precompute aggregates into summary tables refreshed in the background.
- Bound fan out by limiting which queries are allowed to scatter, or by capping the number of shards a query may touch.
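A fan out cap can be enforced at planning time, before any shard is contacted. The following is a rough sketch under assumed names (`plan_shards`, `FanOutLimitExceeded`, and the `shard_of` callback are all hypothetical): it works out which shards a query would touch and rejects the query if that set is larger than the configured limit.

```python
class FanOutLimitExceeded(Exception):
    """Raised when a query would touch more shards than allowed."""

def plan_shards(shard_key_values, num_shards, max_fan_out, shard_of):
    """Return the sorted shard indexes a query must visit, or raise.

    shard_key_values: shard keys named in the filter, or None if the
    filter has no shard key (which means every shard must be queried).
    shard_of: function mapping a shard key to a shard index.
    """
    if shard_key_values is None:
        targets = set(range(num_shards))
    else:
        targets = {shard_of(value) for value in shard_key_values}
    if len(targets) > max_fan_out:
        raise FanOutLimitExceeded(
            f"query touches {len(targets)} shards, limit is {max_fan_out}"
        )
    return sorted(targets)

# Example: 8 shards, modulo placement, at most 2 shards per query.
def by_modulo(key: int) -> int:
    return key % 8

allowed = plan_shards([1, 9, 2], 8, 2, by_modulo)  # keys 1 and 9 share a shard

try:
    plan_shards(None, 8, 2, by_modulo)  # no shard key: would hit all 8 shards
    rejected = False
except FanOutLimitExceeded:
    rejected = True
```

Rejecting at planning time keeps the cost of a refused query close to zero; an alternative design is to route such queries to a precomputed summary table instead of failing them.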
Key idea
Single shard queries are cheap; scatter gather queries pay the latency of the slowest shard and N fold fan out load, so design the schema and shard key to keep common queries single shard.