Cross Shard Queries

Single Shard vs Scatter Gather

The happy path in a sharded system is the single shard query: the filter includes the shard key, so the router sends the request to exactly one node. It is fast and isolated.

Many real queries cannot be pinned to one shard. A query without the shard key, or one that aggregates across users, becomes a scatter gather: the coordinator broadcasts to all shards, collects partial results, and merges them.

The Costs

Latency is bounded by the slowest shard. With many shards, a tail latency on any one drags the whole query.
Fan out load multiplies. One client query becomes N shard queries, so a few scatter queries can saturate the cluster.
Merging aggregations, sorts, and limits across partial results needs careful logic in the coordinator.

Mitigations

Denormalize so common queries carry the shard key and stay single shard.
Precompute aggregates into summary tables refreshed in the background.
Bound fan out by limiting which queries are allowed to scatter, or by capping the number of shards a query may touch.

Key idea

Single shard queries are cheap; cross shard scatter gather queries cost slowest shard latency and N fold fan out, so design to keep queries single shard.

Cross Shard Queries

Single Shard vs Scatter Gather

The Costs

Mitigations

Key idea

Check yourself