Why BM25 exists
BM25 scores how well a document matches a query using three ideas: how often a term appears, how rare the term is, and how long the document is.
The three ingredients
- Term frequency raises the score when a query term appears more often in a document, but with diminishing returns, so the tenth occurrence adds less than the second.
- Inverse document frequency raises the score for rare terms. A word in nearly every document carries little signal.
- Length normalization prevents long documents from winning just because they contain more words.
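As a rough illustration of how the three ingredients combine, here is a minimal Python sketch of one common BM25 variant. The function name bm25_scores, the toy documents, and the parameter values k1=1.2 and b=0.75 are assumptions made for this example, and the IDF form shown (with 1 added inside the logarithm to keep it positive) is only one of several in use.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each tokenized document in `docs` against the tokenized `query`.

    Illustrative sketch only: k1 and b are assumed, commonly quoted defaults.
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: how many documents contain each term at least once.
    df = Counter(term for d in docs for term in set(d))

    scores = []
    for d in docs:
        tf = Counter(d)
        # Length normalization: above 1 for longer-than-average documents.
        norm = 1.0 - b + b * len(d) / avg_len
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # Inverse document frequency: rarer terms get a larger weight.
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            # Saturating term frequency: each extra occurrence adds less.
            score += idf * tf[term] * (k1 + 1.0) / (tf[term] + k1 * norm)
        scores.append(score)
    return scores


# Hypothetical toy corpus, tokenized by whitespace.
docs = [
    "the cat sat on the mat".split(),
    "the dog chased the cat around the garden".split(),
    "the weather is sunny today".split(),
]
scores = bm25_scores(["cat", "garden"], docs)
```

In this toy corpus the second document matches both query terms and should score highest, while the third matches neither and scores zero.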
The tuning knobs
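BM25 has two parameters, conventionally written k1 and b. k1 controls how fast term frequency saturates: smaller values saturate sooner. b, which ranges from 0 to 1, controls how strongly length normalization applies: 0 turns it off and 1 applies it fully. Common defaults (k1 around 1.2, b = 0.75) work well, but tuning can help for specific corpora.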
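The effect of the two parameters can be seen by isolating the term frequency part of the score. The sketch below assumes the conventional parameter names k1 (saturation) and b (length normalization); the helper name tf_component and the document lengths are made up for illustration.

```python
def tf_component(tf, doc_len, avg_len, k1=1.2, b=0.75):
    """Term frequency part of a BM25 score for a single query term.

    Illustrative sketch: k1 and b are the conventional parameter names,
    and the default values are assumptions, not tuned settings.
    """
    norm = 1.0 - b + b * doc_len / avg_len
    return tf * (k1 + 1.0) / (tf + k1 * norm)


# Saturation: the gain from each additional occurrence keeps shrinking.
gains = [
    tf_component(t + 1, 100, 100) - tf_component(t, 100, 100)
    for t in range(1, 10)
]

# A larger k1 saturates more slowly, so the tenth occurrence still adds more.
late_gain_low_k1 = tf_component(10, 100, 100, k1=0.5) - tf_component(9, 100, 100, k1=0.5)
late_gain_high_k1 = tf_component(10, 100, 100, k1=2.0) - tf_component(9, 100, 100, k1=2.0)

# b = 0 switches length normalization off; b = 0.75 penalizes the longer document.
short_no_norm = tf_component(3, 50, 100, b=0.0)
long_no_norm = tf_component(3, 200, 100, b=0.0)
short_with_norm = tf_component(3, 50, 100, b=0.75)
long_with_norm = tf_component(3, 200, 100, b=0.75)
```

Whatever the term count, this component stays below k1 + 1, which is what keeps a single heavily repeated term from dominating the score.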
Why it endures
BM25 is cheap to compute, needs no training, and is a strong baseline. Many modern systems still use it for first-stage retrieval before applying heavier rankers.
Key idea
BM25 rewards query terms that appear often in a document, with diminishing returns, and that are rare across the corpus, while normalizing for document length. The result is a strong, training-free baseline.