When the Shard Key Is Absent
If a query filters on a specific shard key value, the router can send it to exactly one shard. But many queries do not include the shard key, so the router cannot tell which shards hold matching rows. To answer them, the system must ask every shard and combine the answers. This pattern is called scatter gather.
How Scatter Gather Runs
- Scatter: the query is sent to all shards in parallel.
- Each shard runs the query against its local subset.
- Gather: a coordinator merges the partial results into one answer.
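
The three steps above can be sketched in a few lines. This is a minimal illustration, not a production design: the in-memory `SHARDS` list, the `city` filter, and the function names `query_shard` and `scatter_gather` are all hypothetical, and a thread pool stands in for network calls to real shards.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory shards: each list is one shard's local subset,
# partitioned by user_id, so a filter on city cannot be routed.
SHARDS = [
    [{"user_id": 1, "city": "Oslo"}, {"user_id": 4, "city": "Lima"}],
    [{"user_id": 2, "city": "Lima"}, {"user_id": 5, "city": "Oslo"}],
    [{"user_id": 3, "city": "Lima"}],
]


def query_shard(rows, city):
    """Run the filter against one shard's local subset."""
    return [row for row in rows if row["city"] == city]


def scatter_gather(shards, city):
    """Send the query to every shard in parallel, then merge the partials."""
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        # Scatter: submit one task per shard.
        futures = [pool.submit(query_shard, rows, city) for rows in shards]
        # Gather: result() blocks until that shard has answered, so the
        # total wait is set by whichever shard responds last.
        partials = [future.result() for future in futures]
    # Merge: flatten the partial results and impose one global order.
    merged = [row for partial in partials for row in partial]
    return sorted(merged, key=lambda row: row["user_id"])
```

Calling `scatter_gather(SHARDS, "Lima")` touches all three shards even though each one holds only part of the answer.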
Why It Is Expensive
- Latency is set by the slowest shard, not the average, because the coordinator must wait for every response before it can answer.
- A single slow or failing shard degrades every fan out query.
- Each shard can sort, paginate, and aggregate only its own subset, so the coordinator must redo that work on the merged results.
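
Two of these costs can be made concrete with a short sketch. The function names `p_any_slow` and `fetch_page` are hypothetical. The first assumes shards are slow independently of each other, which real clusters only approximate. The second assumes a globally sorted page is wanted, in which case every shard has to return its first `offset + limit` rows because the coordinator cannot know in advance which shard holds the rows for that page.

```python
import heapq
from itertools import islice


def p_any_slow(p_slow, num_shards):
    """Chance that at least one shard is slow on a fan out query.

    Assumes each shard is slow independently with probability p_slow.
    """
    return 1 - (1 - p_slow) ** num_shards


def fetch_page(shards, offset, limit):
    """Return one globally sorted page from values spread over shards."""
    # Each shard sorts locally and returns its first offset + limit rows.
    per_shard = [sorted(rows)[: offset + limit] for rows in shards]
    # The coordinator merges the sorted partial results, then skips
    # `offset` rows and keeps `limit` rows.
    merged = heapq.merge(*per_shard)
    return list(islice(merged, offset, offset + limit))
```

Under the independence assumption, a shard that is slow on 1% of requests makes a 100-shard fan out query slow about 63% of the time, since `p_any_slow(0.01, 100)` is roughly 0.634. In `fetch_page`, the rows each shard must ship grow with the offset, so deep pages cost more than early ones.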
Reducing Fan Out
- Add a secondary index table keyed by the queried attribute, so the lookup routes to one shard instead of all of them.
- Denormalize so common queries carry the shard key.
- Maintain a separate search system for queries that inherently span shards.
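
The secondary index option can be sketched as follows. Everything here is illustrative: the user table sharded by `user_id`, the email index sharded by email, the `NUM_SHARDS` value, and the function names are assumptions, and dictionaries stand in for real shards.

```python
import zlib

NUM_SHARDS = 4  # hypothetical cluster size


def shard_for(key):
    """Map a key to a shard with a stable hash.

    crc32 is used because Python's built-in hash() is randomized per
    process for strings.
    """
    return zlib.crc32(str(key).encode()) % NUM_SHARDS


# Primary table, sharded by user_id.
user_shards = [dict() for _ in range(NUM_SHARDS)]
# Secondary index table, sharded by email, mapping email to user_id.
email_index_shards = [dict() for _ in range(NUM_SHARDS)]


def insert_user(user_id, email):
    """Write the row and its index entry (two writes, possibly two shards)."""
    user_shards[shard_for(user_id)][user_id] = {"user_id": user_id, "email": email}
    email_index_shards[shard_for(email)][email] = user_id


def find_by_email(email):
    """Resolve an email with two single-shard lookups instead of a fan out."""
    user_id = email_index_shards[shard_for(email)].get(email)
    if user_id is None:
        return None
    return user_shards[shard_for(user_id)][user_id]
```

The trade-off is on the write path: every insert now touches two tables that may live on different shards, and the index must be kept consistent with the primary rows.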
Key idea
Queries lacking the shard key trigger scatter gather across all shards, with latency bounded by the slowest shard and a costly merge step.