Retrieval Chunking for Agents
Before documents can be retrieved by meaning, they must be split into pieces. Chunking decides those boundaries, and it quietly governs retrieval quality more than most people expect.
The size trade-off
- Small chunks give precise matches but may lack the surrounding context needed to be useful.
- Large chunks carry context but dilute the embedding, so the signal of a specific fact gets averaged away.
- A chunk boundary that falls mid-sentence or mid-table produces a chunk that retrieves as half an idea.
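The last point is easy to see with a minimal sketch of fixed-length splitting. The function name, the sample text, and the 30-character size are all illustrative assumptions, not taken from any particular library.

```python
def fixed_length_chunks(text: str, size: int) -> list[str]:
    """Split text into consecutive pieces of at most `size` characters."""
    if size <= 0:
        raise ValueError("size must be positive")
    return [text[i:i + size] for i in range(0, len(text), size)]


# Hypothetical sample text: two short, self-contained facts.
text = "The pump runs at 40 psi. Replace the filter every 90 days."

# With a 30-character budget the cut lands inside the word "Replace",
# so neither chunk holds the second fact whole.
for chunk in fixed_length_chunks(text, 30):
    print(repr(chunk))
```

Nothing is lost in aggregate, since the chunks still concatenate back to the original text, but no single chunk contains the filter instruction intact.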
Smarter boundaries
Naive fixed-length splitting cuts at arbitrary positions. Better approaches respect structure: split on headings, paragraphs, or semantic shifts so that each chunk is a coherent unit. Overlap between adjacent chunks reduces the chance that a fact straddles a boundary and is never retrieved whole.
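One way to combine structure with overlap is sketched below, assuming paragraphs are separated by blank lines. The function name, the character budget, and the choice to overlap by whole paragraphs are illustrative assumptions. A paragraph longer than the budget is kept whole rather than cut, so `max_chars` is a soft limit here, and the repeated paragraphs count toward it.

```python
def chunk_paragraphs(text: str, max_chars: int = 200, overlap: int = 1) -> list[str]:
    """Pack whole paragraphs into chunks of roughly `max_chars` characters.

    The last `overlap` paragraphs of each chunk are repeated at the start
    of the next one, so a fact near a boundary appears in both chunks.
    """
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks: list[str] = []
    current: list[str] = []
    for para in paragraphs:
        candidate = "\n\n".join(current + [para])
        if current and len(candidate) > max_chars:
            # Close the current chunk and carry its tail forward as overlap.
            chunks.append("\n\n".join(current))
            current = current[-overlap:] if overlap > 0 else []
        current.append(para)
    if current:
        chunks.append("\n\n".join(current))
    return chunks


# Hypothetical three-paragraph document.
doc = "Alpha one.\n\nBeta two.\n\nGamma three."
for chunk in chunk_paragraphs(doc, max_chars=25, overlap=1):
    print(repr(chunk))
```

In this example the middle paragraph lands in both chunks, which is the point of overlap: a query about it can match either neighbor. Splitting on headings or semantic shifts would follow the same shape with a different way of producing the initial units.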
Advanced patterns
Some systems embed small chunks for precise matching but return a larger parent passage for context, a small-to-big strategy. Others store a summary alongside each chunk so that retrieval matches the gist while the agent reads the detail. The right scheme depends on the data, so chunking should be tuned against a retrieval evaluation rather than guessed once and forgotten.
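The small-to-big idea can be sketched as follows. To stay self-contained, this example scores matches by shared words, which is only a stand-in for embedding similarity. The `Child` class, the function names, and the sample passages are hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass
class Child:
    text: str       # small chunk used for matching
    parent_id: int  # index of the larger passage it was cut from


def tokens(text: str) -> set[str]:
    """Lowercased word set; a toy substitute for an embedding."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def build_index(parents: list[str]) -> list[Child]:
    """Split each parent passage into sentences, remembering the parent."""
    index: list[Child] = []
    for pid, passage in enumerate(parents):
        for sentence in re.split(r"(?<=[.!?])\s+", passage.strip()):
            if sentence:
                index.append(Child(sentence, pid))
    return index


def retrieve_parent(query: str, index: list[Child], parents: list[str]) -> str:
    """Match the query against small chunks, return the enclosing parent."""
    q = tokens(query)
    best = max(index, key=lambda c: len(q & tokens(c.text)))
    return parents[best.parent_id]


# Hypothetical parent passages.
parents = [
    "The pump runs at 40 psi. Replace the filter every 90 days.",
    "The heater draws 12 amps. Descale the tank once a year.",
]
index = build_index(parents)
print(retrieve_parent("how often to replace the filter", index, parents))
```

The query matches the single sentence about the filter, but the caller receives the whole pump passage, so the agent also sees the neighboring pressure detail. A real system would replace `tokens` and the overlap count with an embedding model and a vector search.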
Key idea
Chunking sets retrieval boundaries, and structure-aware splitting with overlap yields coherent, complete passages instead of fragments.