Reranking Retrieved Results

A second pass that reorders candidates for sharper relevance.

What it is

Reranking is a second stage in retrieval that reorders an initial candidate list by relevance. A fast retriever pulls many candidates, then a slower, more accurate model scores each one against the query and sorts them.

Why two stages

A single vector search is fast but coarse. It encodes the query and each document separately and compares only the resulting embeddings, so it misses fine-grained interactions between individual query terms and document terms.

  • The retriever casts a wide net cheaply, typically returning on the order of a hundred top candidates.
  • The reranker examines each query and document pair together and outputs a precise score.
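The two stages can be sketched in a few lines of plain Python. This is a minimal illustration, not a production pipeline: the function names (`embed`, `cosine`, `retrieve`, `pair_score`, `rerank`) and the four-document corpus are all hypothetical, the "embedding" is a bag-of-words count vector, and `pair_score` is a toy stand-in for a real reranking model. It only demonstrates the division of labor: stage 1 compares vectors built independently, stage 2 reads the query and document together.

```python
import math
from collections import Counter


def embed(text):
    """Toy 'embedding': a bag-of-words count vector built from one text alone."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, doc_vectors, k):
    """Stage 1: compare the query vector to precomputed document vectors."""
    q = embed(query)
    ranked = sorted(doc_vectors, key=lambda i: cosine(q, doc_vectors[i]), reverse=True)
    return ranked[:k]


def pair_score(query, doc):
    """Stage 2 stand-in (NOT a real cross encoder): scores the pair jointly.

    Counts query bigrams that occur in the same order in the document,
    a signal that independent bag-of-words vectors cannot capture.
    """
    q = query.lower().split()
    d = doc.lower().split()
    doc_bigrams = set(zip(d, d[1:]))
    return sum(1 for bigram in zip(q, q[1:]) if bigram in doc_bigrams)


def rerank(query, docs, candidate_ids):
    """Reorder only the shortlist by the joint pair score."""
    return sorted(candidate_ids, key=lambda i: pair_score(query, docs[i]), reverse=True)


# Hypothetical toy corpus.
docs = {
    0: "the bank raised the interest rate",
    1: "rate of interest in river bank fishing",
    2: "interest rate cuts by the central bank",
    3: "how to bake sourdough bread",
}
doc_vectors = {i: embed(text) for i, text in docs.items()}  # computed offline

query = "bank interest rate"
shortlist = retrieve(query, doc_vectors, k=3)
final = rerank(query, docs, shortlist)
print(shortlist, final)
```

In this toy corpus the retriever puts document 1 (river bank fishing) at the head of the shortlist because it shares all three query words, while the pairwise scorer moves it to the bottom because the words never appear in the query's order.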

Cross encoders

The common reranker is a cross encoder. It feeds the query and a document into one transformer so attention links every query token to every document token. This is typically far more accurate than comparing separate embeddings, but the score depends on the query, so it cannot be precomputed, and every query-document pair costs a full forward pass. For that reason it runs only on the shortlist.
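A rough sketch of why the shortlist matters, under stated assumptions: the `[CLS]`/`[SEP]` packing follows the BERT-style convention (the lesson does not name a specific model), the helper names are hypothetical, and the corpus and shortlist sizes are made-up numbers chosen only to show the ratio.

```python
def build_pair_input(query_tokens, doc_tokens):
    """Pack both texts into one sequence (BERT-style layout, an assumption)."""
    return ["[CLS]", *query_tokens, "[SEP]", *doc_tokens, "[SEP]"]


def forward_passes(num_docs_scored):
    """A cross encoder needs one forward pass per (query, document) pair."""
    return num_docs_scored


query = "bank interest rate".split()
doc = "interest rate cuts by the central bank".split()

# The query is part of the model input, so this pass cannot be run
# before the query is known.
sequence = build_pair_input(query, doc)

# Number of query-token/document-token pairs that joint attention can relate.
cross_token_links = len(query) * len(doc)

CORPUS_SIZE = 1_000_000  # hypothetical
SHORTLIST_SIZE = 100     # hypothetical

passes_full_corpus = forward_passes(CORPUS_SIZE)
passes_shortlist = forward_passes(SHORTLIST_SIZE)
print(len(sequence), cross_token_links, passes_full_corpus // passes_shortlist)
```

With these assumed sizes, scoring the whole corpus would take ten thousand times as many forward passes per query as scoring the shortlist.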

The result is high recall from the retriever and high precision at the top from the reranker, which matters because an LLM usually reads only the first few passages.
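The recall and precision claims can be made concrete with two standard metrics. The sketch below uses hypothetical document IDs, relevance labels, and orderings purely for illustration; it is not measured data.

```python
def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of all relevant documents found in the top k."""
    hits = sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant_ids)
    return hits / len(relevant_ids)


def precision_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the top k that is relevant."""
    hits = sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant_ids)
    return hits / k


relevant = {"d2", "d7"}                             # hypothetical labels
retrieved = ["d5", "d2", "d9", "d1", "d7", "d4"]    # hypothetical retriever order
reranked = ["d2", "d7", "d5", "d1", "d9", "d4"]     # hypothetical reranker order

shortlist_recall = recall_at_k(retrieved, relevant, k=6)
top2_before = precision_at_k(retrieved, relevant, k=2)
top2_after = precision_at_k(reranked, relevant, k=2)
print(shortlist_recall, top2_before, top2_after)
```

Here the shortlist already contains both relevant documents, so recall is settled by the retriever. Reranking changes only the order, which is what raises precision in the first two positions.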

Key idea

Reranking adds a precise cross encoder pass over a cheap retriever's shortlist, pushing the most relevant passages to the top where the LLM reads them.

Check yourself

1. Why is a cross encoder run only on a shortlist rather than the whole corpus?

2. What does the retriever stage contribute in a rerank pipeline?