Machine Learning

The Chunking Strategies Deep

How splitting documents into pieces shapes what a retriever can find.

5 min read · intro

Why chunk

Most documents are too long to embed usefully as a single vector and too long to paste whole into a prompt. Chunking splits each document into smaller pieces that are embedded and retrieved independently. The size and shape of those pieces decide what the retriever can match.

Common strategies

  • Fixed-size chunking cuts every N characters or tokens. Simple and fast, but it can slice sentences in half.
  • Sentence-based chunking splits on sentence boundaries so each chunk reads cleanly.
  • Structure-aware chunking respects headings, paragraphs, or markdown sections so a chunk stays on one topic.
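
The three strategies above can be sketched in a few lines of Python. This is a minimal illustration, not a production splitter: the function names, the character-based size limits, and the naive sentence regex are all assumptions made for the example, and real systems usually count tokens rather than characters.

```python
import re


def fixed_size_chunks(text: str, size: int) -> list[str]:
    """Cut every `size` characters, ignoring sentence boundaries."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def sentence_chunks(text: str, max_chars: int) -> list[str]:
    """Pack whole sentences into chunks of at most `max_chars` characters.

    A single sentence longer than `max_chars` becomes its own chunk.
    The sentence splitter is a naive regex (assumption for this sketch).
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            chunks.append(current)   # close the chunk before it overflows
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def markdown_section_chunks(text: str) -> list[str]:
    """Split before each markdown heading so a chunk stays in one section."""
    parts = re.split(r"(?m)^(?=#{1,6} )", text)
    return [p.strip() for p in parts if p.strip()]
```

Calling `fixed_size_chunks("abcdefghij", 4)` shows the blind cut, while `markdown_section_chunks` keeps each heading together with the text beneath it.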

The size tradeoff

Small chunks are precise but lose surrounding context, so an answer may need several of them stitched together. Large chunks carry context but dilute the embedding, mixing several topics into one vector and lowering match quality.
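
The dilution effect can be seen even with a toy similarity measure. The sketch below uses a bag-of-words vector and cosine similarity as a stand-in for a real embedding model, which is an assumption made purely for illustration; the query and chunk texts are invented.

```python
import math
import re
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Toy stand-in for an embedding: lowercase word counts."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


query = "reset password"

# A small chunk that stays on one topic.
small_chunk = "To reset your password, open account settings."

# A large chunk that contains the same sentence plus unrelated topics.
large_chunk = (
    small_chunk
    + " Billing runs monthly. Invoices are emailed as PDF files."
    + " Refunds take five business days."
)

small_score = cosine(bag_of_words(query), bag_of_words(small_chunk))
large_score = cosine(bag_of_words(query), bag_of_words(large_chunk))

print(f"small chunk: {small_score:.3f}")
print(f"large chunk: {large_score:.3f}")
```

The large chunk scores lower for the same query because the extra topics enlarge its vector without adding any overlap with the query. Real embedding models behave less mechanically, but the direction of the effect is the same one described above.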

Why it matters

Chunking is often the highest-leverage knob in a RAG system. A bad split scatters an answer across many pieces or buries the key sentence inside a noisy block, and no reranker can fully repair it.
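
A small, contrived example of a bad split: the chunk size below is deliberately chosen so that a fixed-size cut lands in the middle of the key phrase. The text, the phrase, and the choice of cut are all assumptions made for the illustration.

```python
import re

text = "Open account settings. Then reset your password there. Save the changes."
key = "reset your password"

# Worst case on purpose: pick a chunk size whose first boundary
# falls inside the key phrase.
size = text.index(key) + 8

fixed = [text[i:i + size] for i in range(0, len(text), size)]
sentences = re.split(r"(?<=[.!?])\s+", text)

# Which chunks still contain the whole key phrase?
fixed_hits = [chunk for chunk in fixed if key in chunk]
sentence_hits = [chunk for chunk in sentences if key in chunk]

print("fixed-size chunks containing the phrase:", len(fixed_hits))
print("sentence chunks containing the phrase:", len(sentence_hits))
```

With the fixed-size cut, no single chunk holds the full phrase, so a retriever has to find and combine two pieces. The sentence split keeps it in one chunk.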

Key idea

Chunking decides the unit of retrieval, and the size tradeoff between precision and context makes it one of the most impactful choices in a RAG pipeline.

Check yourself

1. What is the main risk of very large chunks?

2. Why does structure-aware chunking often help?