BM25 Deep Dive

The classic scoring function that balances term frequency, rarity, and document length.

Why BM25 exists

BM25 scores how well a document matches a query using three ideas: how often a term appears, how rare the term is, and how long the document is.

The three ingredients

Term frequency raises the score when a query term appears more often, but with diminishing returns, so the tenth occurrence adds less than the second.
Inverse document frequency raises the score for rare terms. A word in nearly every document carries little signal.
Length normalization prevents long documents from winning just because they contain more words.
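
The three ingredients can be combined as in the minimal Python sketch below. It assumes documents are already tokenized into lists of strings, uses the conventional parameter names k1 and b, and picks one common smoothed IDF variant; other variants exist. The function name bm25_scores is illustrative only.

```python
import math
from collections import Counter

def bm25_scores(query_terms, docs, k1=1.2, b=0.75):
    """Score each tokenized document in `docs` against `query_terms`.

    Illustrative sketch: `docs` is a non-empty list of token lists.
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs

    # Document frequency: in how many documents each term appears.
    df = Counter()
    for d in docs:
        df.update(set(d))

    scores = []
    for d in docs:
        tf = Counter(d)
        # Length normalization: documents longer than average get a
        # larger denominator, which lowers each term's contribution.
        norm = k1 * (1 - b + b * len(d) / avg_len)
        score = 0.0
        for t in query_terms:
            if tf[t] == 0:
                continue
            # Inverse document frequency: rare terms weigh more.
            # (Assumed variant; stays non-negative for any df.)
            idf = math.log(1 + (n_docs - df[t] + 0.5) / (df[t] + 0.5))
            # Saturating term frequency: each extra occurrence adds less.
            score += idf * tf[t] * (k1 + 1) / (tf[t] + norm)
        scores.append(score)
    return scores

docs = [["cat", "sat"], ["cat", "cat", "mat"], ["dog", "ran", "far", "away"]]
print(bm25_scores(["cat"], docs))
```

In this toy corpus the third document scores zero for the query "cat" because it never contains the term.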

The tuning knobs

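BM25 has two parameters, conventionally called k1 and b. k1 controls how fast term frequency saturates: smaller values make repeated occurrences stop adding to the score sooner. b controls how strongly length normalization applies, from 0 (none) to 1 (full). Common defaults (k1 between 1.2 and 2.0, b = 0.75) work well, but tuning helps for specific corpora, such as collections of very short or very uneven-length documents.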
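
To make the effect of the two parameters concrete, here is a small Python sketch of the per-term factor only, with IDF left out. The names k1 and b are the conventional ones for the saturation and length parameters; the helper name tf_component and the sample values are assumptions made for illustration.

```python
def tf_component(tf, doc_len, avg_len, k1=1.2, b=0.75):
    """Per-term BM25 factor without IDF (illustrative helper)."""
    # b = 0 ignores document length; b = 1 normalizes fully.
    norm = 1 - b + b * doc_len / avg_len
    # The factor grows with tf but never exceeds k1 + 1.
    return tf * (k1 + 1) / (tf + k1 * norm)

# Saturation: for an average-length document, watch how quickly the
# factor flattens as tf grows under different k1 values.
for k1 in (0.5, 1.2, 2.0):
    row = [round(tf_component(tf, 100, 100, k1=k1), 2) for tf in (1, 2, 5, 10)]
    print("k1 =", k1, row)

# Length normalization: the same tf in a document four times the
# average length, under different b values.
for b in (0.0, 0.75, 1.0):
    print("b =", b, round(tf_component(3, 400, 100, b=b), 2))
```

With b = 0 the long document is scored as if it were average length, while larger b values penalize it more.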

Why it endures

BM25 is cheap to compute, needs no training, and is a strong baseline. Many modern systems still use it for first-stage retrieval, narrowing a large collection to a short candidate list before applying heavier rankers.
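
The two-stage pattern can be sketched in a few lines of Python. Everything here is hypothetical: retrieve_then_rerank, first_stage_score, and rerank_score are placeholder names, and the scorers are passed in as plain functions rather than tied to any particular library.

```python
import heapq

def retrieve_then_rerank(query, doc_ids, first_stage_score, rerank_score, k=100):
    """Keep the top-k documents by a cheap score, then reorder them
    with a more expensive score (hypothetical sketch)."""
    # Stage 1: cheap scorer (for example BM25) over every document.
    candidates = heapq.nlargest(
        k, doc_ids, key=lambda d: first_stage_score(query, d)
    )
    # Stage 2: expensive scorer over the short candidate list only.
    return sorted(candidates, key=lambda d: rerank_score(query, d), reverse=True)
```

The heavier ranker only ever sees k documents, which is what keeps the overall cost manageable.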
