Why shard
A full index does not fit on one machine, and one machine cannot serve all traffic. Sharding splits the index into pieces spread across servers.
Document sharding
The common approach is document sharding: each shard holds a subset of documents and a full inverted index over them. A query runs on every shard in parallel.
Scatter and gather
- A coordinator receives the query and scatters it to all shards.
- Each shard returns its top candidates with scores.
- The coordinator gathers and merges them into a final ranked page.
Because each shard ranks only its own documents, global statistics like document frequency may need sharing so scores stay comparable across shards.
Replication
Each shard is replicated for availability and throughput. The coordinator picks one replica per shard, balancing load and routing around failures.
Diagram
Key idea
Document sharding spreads the index across servers; a coordinator scatters each query and gathers per shard top results into one ranked page.