Why combine the two
Lexical search like BM25 nails exact terms but misses paraphrases. Vector search captures meaning but fumbles precise identifiers. Hybrid search runs both and fuses their results, aiming to get the strengths of each.
How fusion works
The two retrievers produce scores on different scales (BM25 scores are unbounded, while cosine similarities fall in [-1, 1]), so you cannot simply add the numbers. Two common strategies handle this:
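- Reciprocal rank fusion ignores raw scores and combines results by their rank in each list: a document earns 1 / (k + rank) from every list it appears in, and the contributions are summed. It is simple and insensitive to score scale.
- Weighted score fusion normalizes each retriever's scores to a common range, then blends them with a tunable weight, giving more control at the cost of calibration.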
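Both strategies can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the function names, the constant k=60 (a conventional default for reciprocal rank fusion), the choice of min-max normalization, and the weight name alpha are all assumptions made for the example.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists of doc ids by rank only (k=60 is an assumed default)."""
    scores = defaultdict(float)
    for ranking in rankings:
        # Ranks start at 1; each list contributes 1 / (k + rank) per document.
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


def min_max(scores):
    """Rescale a {doc_id: score} mapping to [0, 1] (assumed normalization)."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {d: 1.0 for d in scores}
    return {d: (s - lo) / (hi - lo) for d, s in scores.items()}


def weighted_score_fusion(lexical, vector, alpha=0.5):
    """Blend normalized scores; alpha is the weight given to the lexical side."""
    lex, vec = min_max(lexical), min_max(vector)
    fused = {
        # A document missing from one retriever's results contributes 0 there.
        d: alpha * lex.get(d, 0.0) + (1 - alpha) * vec.get(d, 0.0)
        for d in set(lex) | set(vec)
    }
    return sorted(fused, key=lambda d: (-fused[d], d))


# Toy inputs: doc ids and scores are made up for illustration.
lexical_scores = {"a": 12.0, "b": 8.0, "c": 4.0}   # BM25-like, unbounded
vector_scores = {"b": 0.9, "d": 0.8, "a": 0.5}     # cosine-like

rrf_order = reciprocal_rank_fusion([["a", "b", "c"], ["b", "d", "a"]])
blend_order = weighted_score_fusion(lexical_scores, vector_scores, alpha=0.5)
```

Rank fusion needs only the ordered ids, so it works even when a retriever does not expose scores. The weighted blend needs the scores themselves and a sensible alpha, which is where the calibration cost comes in.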
Hybrid results often then pass to a reranker for the final ordering, since fusion only produces a candidate pool.
Practical notes
- Each retriever's vote can be weighted by query type: a query full of rare tokens leans lexical, while a vague natural-language question leans vector.
- Maintaining two indexes costs more storage and operational complexity, a real trade-off against the recall gain.
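One way to picture query-dependent weighting is a heuristic that raises the lexical weight as the share of rare query tokens grows. The sketch below is purely illustrative: the function name lexical_weight, the document-frequency table, the 1% rarity threshold, and the 0.3 to 0.8 weight range are hypothetical choices, not values from any particular system.

```python
def lexical_weight(query, doc_freq, num_docs, rare_ratio=0.01, lo=0.3, hi=0.8):
    """Return a lexical weight in [lo, hi] based on the share of rare tokens.

    doc_freq maps a token to the number of documents containing it.
    All thresholds here are assumed values for illustration.
    """
    tokens = query.lower().split()
    if not tokens:
        return lo
    # A token is "rare" if it appears in at most rare_ratio of the documents;
    # tokens unseen in the table count as rare (document frequency 0).
    rare = sum(1 for t in tokens if doc_freq.get(t, 0) / num_docs <= rare_ratio)
    return lo + (hi - lo) * (rare / len(tokens))


# Made-up corpus statistics for a 1000-document collection.
doc_freq = {
    "how": 900, "do": 800, "i": 950, "reset": 120,
    "my": 700, "password": 150, "err_4031": 2,
}

id_heavy = lexical_weight("ERR_4031 KB-77812", doc_freq, num_docs=1000)
vague = lexical_weight("how do i reset my password", doc_freq, num_docs=1000)
```

The returned value could serve as the blending weight in a weighted score fusion step. In practice the thresholds would need tuning against real queries.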
Key idea
Hybrid search runs lexical and vector retrieval together and fuses them by rank or normalized score, capturing exact matches and semantic recall in one ranking.