The boundary problem
When you cut a document into chunks, the key sentence for a query may land right on a boundary, split between two pieces. Neither chunk then holds the full thought, and retrieval can miss it. Chunk overlap copies a slice of text from the end of one chunk into the start of the next so ideas that straddle a cut survive in at least one piece.
How overlap works
Overlap is usually set as a fraction of chunk size, often ten to twenty percent. With a five-hundred-token chunk and a fifty-token overlap, each chunk begins with the last fifty tokens of its predecessor, so every new chunk adds four hundred fifty tokens of new text.
- Too little overlap risks splitting a sentence or a definition across chunks.
- Too much overlap stores the same text many times, inflating the index and returning near-duplicate results.
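The mechanics can be sketched as a sliding window over a token list. This is a minimal illustration, not a reference implementation: the function name chunk_with_overlap and its parameters are hypothetical, and the tokens are assumed to come from whatever tokenizer you already use.

```python
def chunk_with_overlap(tokens, chunk_size=500, overlap=50):
    """Split tokens into windows that share `overlap` tokens with their predecessor."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be at least 0 and smaller than chunk_size")

    step = chunk_size - overlap  # new tokens contributed by each chunk
    chunks = []
    start = 0
    while start < len(tokens):
        chunks.append(tokens[start:start + chunk_size])
        # Stop once this window has reached the end of the document.
        if start + chunk_size >= len(tokens):
            break
        start += step
    return chunks


# Stand-in for a tokenized document of 1200 tokens.
tokens = list(range(1200))
chunks = chunk_with_overlap(tokens, chunk_size=500, overlap=50)

# Each chunk starts with the last 50 tokens of the one before it.
print(chunks[1][:50] == chunks[0][-50:])
```

The last chunk may be shorter than chunk_size. Whether to keep it, pad it, or merge it into its neighbor is a design choice this sketch leaves open.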
Tuning it
Set overlap large enough to capture a typical complete thought, such as a sentence or a short list, but no larger. Watch for near-duplicate hits in retrieval results: they are a sign that overlap is wasting space.
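One rough way to spot duplicate hits is to compare the word sets of the returned chunks. The sketch below uses Jaccard similarity, which is a technique chosen here for illustration only. The helper names and the 0.5 threshold are assumptions, not values taken from any particular retrieval library.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity between the word sets of two strings."""
    set_a, set_b = set(a.split()), set(b.split())
    if not set_a and not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)


def near_duplicate_pairs(hits, threshold=0.5):
    """Return index pairs of hits whose word sets overlap at or above threshold."""
    return [
        (i, j)
        for (i, a), (j, b) in combinations(enumerate(hits), 2)
        if jaccard(a, b) >= threshold
    ]


# Hypothetical retrieval results: the first two come from overlapping chunks.
hits = [
    "the cache is flushed when the buffer is full",
    "cache is flushed when the buffer is full and",
    "overlap is set as a fraction",
]
print(near_duplicate_pairs(hits))
```

If many of the top results pair up this way, the overlap is probably larger than it needs to be.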
Why it matters
Overlap is cheap insurance against boundary loss, but excessive overlap quietly bloats storage and crowds out diverse results.
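The storage cost can be estimated with simple arithmetic. Assuming fixed-size chunks over a long document, each chunk adds chunk_size minus overlap new tokens, so the index stores roughly chunk_size / (chunk_size - overlap) times the original text. The function name below is hypothetical, and the estimate ignores the shorter final chunk.

```python
def storage_inflation(chunk_size, overlap):
    """Approximate ratio of stored tokens to original tokens for a long document."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be at least 0 and smaller than chunk_size")
    return chunk_size / (chunk_size - overlap)


# Compare a modest overlap with two much larger ones for a 500-token chunk.
for overlap in (50, 250, 400):
    print(overlap, round(storage_inflation(500, overlap), 2))
```

By this estimate a ten percent overlap costs about eleven percent extra storage, while a fifty percent overlap doubles the index.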
Key idea
Chunk overlap copies a small slice of text between neighboring chunks so that boundary-spanning ideas survive. Tune it to be just large enough to hold a complete thought without bloating the index.