Parent Document Retrieval

The precision-context conflict

Small chunks match queries precisely because their embeddings are focused. But a small chunk often lacks the surrounding context the model needs to answer well. Parent document retrieval resolves this by searching over small child chunks while returning the larger parent passage each match came from to the generator.

How it works

Each document is split twice: once into larger parent passages, and again into small child chunks. The child chunks are embedded and indexed for search; the parents are only stored. Each child remembers which larger parent it came from.

A query matches the precise child chunk.
The system looks up that child's parent and returns the parent text instead.

So matching stays sharp while the context handed to the model stays rich.
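
The flow above can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: the class name ParentDocumentIndex, the word-count splitter, and the bag-of-words embed function are all hypothetical stand-ins (a real system would use an embedding model and a vector store). It shows the one essential piece of bookkeeping, which is that every indexed child carries the id of its parent.

```python
import math
import re
from collections import Counter


def split_words(text, size):
    """Split text into consecutive pieces of at most `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def embed(text):
    # Assumption: a bag-of-words count vector stands in for a real embedding.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ParentDocumentIndex:
    """Hypothetical toy index: search children, return parents."""

    def __init__(self, parent_size=40, child_size=8):
        self.parent_size = parent_size
        self.child_size = child_size
        self.parents = []   # parent_id -> parent text (stored, not searched)
        self.children = []  # (child_vector, parent_id) pairs (searched)

    def add(self, document):
        # First split: large parents. Second split: small children per parent.
        for parent_text in split_words(document, self.parent_size):
            parent_id = len(self.parents)
            self.parents.append(parent_text)
            for child_text in split_words(parent_text, self.child_size):
                self.children.append((embed(child_text), parent_id))

    def retrieve(self, query, k=3):
        q = embed(query)
        # Rank the small child chunks by similarity to the query.
        ranked = sorted(self.children, key=lambda c: cosine(q, c[0]), reverse=True)
        # Map the top-k children to their parents, dropping duplicate parents.
        seen, results = set(), []
        for _, parent_id in ranked[:k]:
            if parent_id not in seen:
                seen.add(parent_id)
                results.append(self.parents[parent_id])
        return results
```

Note that several top-ranked children may share one parent, so the sketch deduplicates by parent id; the generator then receives each parent passage at most once.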

Variations

The parent can be the full document, a section, or a sliding window around the child. Larger parents give more context but risk pulling in irrelevant material, so the parent size is itself a tuning knob.
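
As a rough sketch of the sliding-window variant, the parent can be built at query time from the matched child and its neighbours. The function name window_parent and the radius parameter are assumptions made for illustration; the point is only that a single number controls how much context surrounds the hit.

```python
def window_parent(chunks, hit, radius=1):
    """Return the matched chunk plus up to `radius` neighbours on each side.

    `chunks` is the ordered list of child chunks from one document and
    `hit` is the position of the chunk that matched the query.
    """
    start = max(0, hit - radius)                # clamp at the document start
    end = min(len(chunks), hit + radius + 1)    # clamp at the document end
    return " ".join(chunks[start:end])
```

With radius set to 0 the parent is just the child itself, and with a radius at least as long as the document the parent becomes the full document, so the other variations sit at the two ends of this one knob.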

Why it matters

This pattern decouples the unit you search from the unit you read. You get the matching precision of small, focused embeddings and the completeness of large readable passages without having to trade one for the other.

Key idea