System Design

Learning to Rank in Search

Using a trained model to order results from many relevance features.

6 min read · advanced

From hand-tuned to learned

Hand-tuned boosts get unwieldy once you have dozens of signals, because every new weight interacts with the ones already in place. Learning to rank trains a model to combine the features into a single relevance order, learning the weights from data instead of guessing them.
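To make "learning the weights from data" concrete, here is a minimal sketch that contrasts a guessed scoring formula with weights fitted by gradient descent. Everything in it is an assumption for illustration: the two features (a BM25 score and a freshness value), the guessed weights, and the choice of a plain linear model with squared error, which is the simplest option rather than what a production ranker would use.

```python
from typing import List


def hand_tuned_score(features: List[float]) -> float:
    """Hypothetical hand-tuned formula: the weights 2.0 and 0.5 are guesses."""
    bm25, freshness = features
    return 2.0 * bm25 + 0.5 * freshness


def fit_linear_weights(rows: List[List[float]], labels: List[float],
                       lr: float = 0.05, epochs: int = 2000) -> List[float]:
    """Fit one weight per feature by gradient descent on mean squared error."""
    n_features = len(rows[0])
    weights = [0.0] * n_features
    for _ in range(epochs):
        grad = [0.0] * n_features
        for x, y in zip(rows, labels):
            err = sum(w * xi for w, xi in zip(weights, x)) - y
            for j, xj in enumerate(x):
                grad[j] += 2.0 * err * xj / len(rows)
        weights = [w - lr * g for w, g in zip(weights, grad)]
    return weights


def learned_score(weights: List[float], features: List[float]) -> float:
    """Score a document with the fitted weights."""
    return sum(w * x for w, x in zip(weights, features))


# Toy data, made up for this sketch: each row is [bm25, freshness].
rows = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.2]]
labels = [1.0, 3.0, 4.0, 1.1]  # relevance grades the model should reproduce
weights = fit_linear_weights(rows, labels)
```

On this toy data the fitted weights end up favoring freshness over BM25, the opposite of the guessed formula, which is the kind of correction that is hard to find by hand once there are dozens of features.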

The two-phase pipeline

Scoring every document in the index with a heavy model is too slow at query time, so ranking is split into two phases:

  • Retrieval uses a cheap method like BM25 to fetch a candidate set, often a few hundred documents.
  • Re-ranking applies the learned model only to those candidates to produce the final order.
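The two phases above can be sketched as follows. This is an illustration under stated assumptions, not a real search stack: cheap_score is a crude term-count stand-in for BM25, the corpus is an in-memory dict, and model is any callable that returns a relevance score. All function names are hypothetical.

```python
from typing import Callable, Dict, List


def cheap_score(query: str, doc_text: str) -> float:
    """Stand-in for BM25: counts occurrences of query terms in the document."""
    terms = query.lower().split()
    words = doc_text.lower().split()
    return float(sum(words.count(t) for t in terms))


def retrieve(query: str, corpus: Dict[str, str], k: int) -> List[str]:
    """Phase 1: score the whole corpus cheaply and keep the top k matches."""
    scored = [(cheap_score(query, text), doc_id) for doc_id, text in corpus.items()]
    scored = [(s, d) for s, d in scored if s > 0]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))  # best score first, id breaks ties
    return [doc_id for _, doc_id in scored[:k]]


def rerank(query: str, candidates: List[str], corpus: Dict[str, str],
           model: Callable[[str, str], float]) -> List[str]:
    """Phase 2: run the expensive model on the candidates only."""
    return sorted(candidates, key=lambda d: model(query, corpus[d]), reverse=True)


def search(query: str, corpus: Dict[str, str],
           model: Callable[[str, str], float], k: int = 200) -> List[str]:
    """Retrieve a candidate set, then re-rank it with the learned model."""
    return rerank(query, retrieve(query, corpus, k), corpus, model)
```

The point of the split is visible in the call counts: the model runs at most k times per query regardless of corpus size, so k becomes the knob that trades recall against latency.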

Features and objectives

Features mix query signals, document signals, and query-document interactions such as the BM25 score or the click rate for that query-document pair. The model is trained on relevance labels, which may be human judgments or clicks converted into preferences between documents.
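As a rough sketch of both halves of that paragraph: the first function builds one feature vector per query-document pair, and the second turns a click log into preference pairs. The feature names are made up for illustration, and the click rule used here (a clicked result is preferred over every unclicked result shown above it) is one common heuristic, assumed for this example rather than taken from the lesson.

```python
from typing import Dict, List, Set, Tuple


def extract_features(query: str, doc_text: str, click_rate: float) -> Dict[str, float]:
    """Build a feature vector for one (query, document) pair."""
    q_terms = query.lower().split()
    d_terms = doc_text.lower().split()
    overlap = sum(1 for t in set(q_terms) if t in d_terms)
    return {
        "query_len": float(len(q_terms)),   # query signal
        "doc_len": float(len(d_terms)),     # document signal
        "term_overlap": float(overlap),     # query-document interaction
        "click_rate": click_rate,           # query-document interaction from logs
    }


def clicks_to_preferences(ranked_ids: List[str],
                          clicked: Set[str]) -> List[Tuple[str, str]]:
    """Return (preferred, less_preferred) pairs from one result page.

    Assumed heuristic: a clicked document beats each unclicked document
    that was displayed above it.
    """
    pairs = []
    for pos, doc_id in enumerate(ranked_ids):
        if doc_id in clicked:
            for above in ranked_ids[:pos]:
                if above not in clicked:
                    pairs.append((doc_id, above))
    return pairs
```

For example, if results a, b, c were shown in that order and only c was clicked, the sketch yields the pairs (c, a) and (c, b). A click on the top result yields no pairs, since nothing was skipped to reach it.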

Training objectives fall into families:

  • Pointwise predicts a relevance score for each document independently.
  • Pairwise learns which of two documents should rank higher.
  • Listwise optimizes the whole ordering, targeting a metric like discounted cumulative gain (DCG) directly or through a smooth surrogate.

Pairwise and listwise usually beat pointwise because search quality is about order, not absolute scores.
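The difference between the families shows up clearly in code. The sketch below assumes graded labels (higher is more relevant), squared error for the pointwise loss, a logistic loss on score differences for the pairwise loss, and the exponential-gain variant of DCG normalized by the ideal ordering (NDCG). These are common choices picked for illustration; other formulations exist.

```python
import math
from typing import List


def pointwise_loss(scores: List[float], labels: List[float]) -> float:
    """Mean squared error between each score and its label."""
    return sum((s - y) ** 2 for s, y in zip(scores, labels)) / len(scores)


def pairwise_loss(scores: List[float], labels: List[float]) -> float:
    """Mean logistic loss over pairs where document i should outrank document j."""
    losses = []
    for i in range(len(scores)):
        for j in range(len(scores)):
            if labels[i] > labels[j]:
                margin = scores[i] - scores[j]
                losses.append(math.log1p(math.exp(-margin)))
    return sum(losses) / len(losses) if losses else 0.0


def dcg(ranked_labels: List[float]) -> float:
    """Discounted cumulative gain with gain 2^label - 1 and a log2 position discount."""
    return sum((2 ** rel - 1) / math.log2(pos + 2)
               for pos, rel in enumerate(ranked_labels))


def ndcg(scores: List[float], labels: List[float]) -> float:
    """DCG of the ordering induced by the scores, divided by the best possible DCG."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ideal = dcg(sorted(labels, reverse=True))
    return dcg([labels[i] for i in order]) / ideal if ideal > 0 else 0.0


labels = [2, 1, 0]
exact = [2.0, 1.0, 0.0]       # matches the labels exactly
rescaled = [30.0, 20.0, 10.0]  # same order, very different values
reversed_order = [0.0, 1.0, 2.0]
```

Here exact and rescaled produce the same ordering and therefore the same NDCG, yet the pointwise loss treats rescaled as badly wrong. The pairwise loss and NDCG only penalize reversed_order, which is the one that actually puts the worst document first.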

Key idea

Learning to rank retrieves candidates cheaply, then re-ranks them with a trained model over many features, usually with pairwise or listwise objectives that target order.

Check yourself

1. Why is learning to rank split into retrieval and re-ranking?

2. Why do pairwise and listwise objectives usually beat pointwise?

3. What kinds of features feed a learning-to-rank model?