The precision-context conflict
Small chunks match queries precisely, since their embeddings are focused. But a small chunk often lacks the surrounding context the model needs to answer well. Parent document retrieval resolves this by searching over small child chunks while returning their larger parent passage to the generator.
How it works
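Each document is split at two granularities: larger parent passages, and small child chunks cut from within each parent. Only the child chunks are embedded and indexed for search, and each child stores a reference to the parent it came from. Retrieval then takes two steps: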
- A query matches the precise child chunk.
- The system looks up that child's parent and returns the parent text instead.
So matching stays sharp while the context handed to the model stays rich.
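The flow above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: `embed` is a toy bag-of-words stand-in for a real embedding model, the parents are assumed to be already-split passages, and the names `build_index`, `retrieve`, and `child_size` are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def build_index(parents, child_size=8):
    """Cut each parent into children of `child_size` words.

    Each index entry keeps the child's vector, its text, and the id of
    the parent it came from.
    """
    index = []
    for parent_id, parent_text in enumerate(parents):
        words = parent_text.split()
        for start in range(0, len(words), child_size):
            child_text = " ".join(words[start:start + child_size])
            index.append((embed(child_text), child_text, parent_id))
    return index


def retrieve(query, index, parents, top_k=1):
    """Rank children by similarity, then return their parents, deduplicated."""
    q = embed(query)
    ranked = sorted(index, key=lambda entry: cosine(q, entry[0]), reverse=True)
    seen, results = set(), []
    for _, _, parent_id in ranked:
        if parent_id not in seen:
            seen.add(parent_id)
            results.append(parents[parent_id])
        if len(results) == top_k:
            break
    return results


# Illustrative data (assumed): two parent passages.
parents = [
    "The billing service retries failed charges three times. After the third "
    "failure the account is suspended and the owner receives an email with a "
    "reactivation link.",
    "The search service rebuilds its index nightly. Queries issued during the "
    "rebuild are served from the previous snapshot so users never see partial "
    "results.",
]
index = build_index(parents)
hits = retrieve("account suspended after failure", index, parents)
```

Here the query matches one eight-word child inside the billing passage, but `hits` holds the whole passage, including the sentence about the reactivation email that the child alone would have cut off. Deduplicating by parent id matters because several children of the same parent can rank highly for one query.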
Variations
The parent can be the full document, a section, or a sliding window around the child. Larger parents give more context but risk pulling in irrelevant material, so the parent size is itself a tuning knob.
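The sliding-window variant can be sketched as follows. This is an illustrative assumption about how such a window might be built: the parent is the matched child plus a fixed number of neighbouring children on each side, and the names `window_parent` and `radius` are hypothetical.

```python
def window_parent(children, hit, radius=1):
    """Return the matched child plus up to `radius` neighbours on each side.

    `children` is the ordered list of child chunks from one document and
    `hit` is the position of the child that matched the query.
    """
    lo = max(0, hit - radius)                    # clamp at the document start
    hi = min(len(children), hit + radius + 1)    # clamp at the document end
    return " ".join(children[lo:hi])


# Illustrative data (assumed): four consecutive child chunks.
children = ["Install the agent.", "Set the API key.", "Restart the service.", "Check the logs."]
context = window_parent(children, hit=2, radius=1)
```

In this sketch `radius` plays the role of the tuning knob: `radius=0` returns only the matched child, while a radius at least as large as the document returns the full document as the parent.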
Why it matters
This pattern decouples the unit you search from the unit you read. You get the matching precision of small, focused embeddings and the completeness of large, readable passages without having to trade one for the other.
Key idea
Parent document retrieval searches over small, precise child chunks but returns their larger parent passages, decoupling the unit you match from the unit the model reads.