Machine Learning

The Cross Encoder Reranking Deep

Score each candidate by reading the query and passage together.

5 min read · core

Why a second pass

The vector search that finds candidates uses a bi encoder, which embeds the query and each passage separately and then compares the vectors. That is fast but coarse, because each passage is encoded without ever seeing the query. A cross encoder reranker re-scores the top candidates by reading the query and each passage together.

How a cross encoder differs

  • A bi encoder produces one vector per text, so passage vectors can be computed in advance and the approach scales to millions of passages, but it cannot model fine-grained interactions between query tokens and passage tokens.
  • A cross encoder feeds the query and one passage into a single model at once, letting every query token attend to every passage token, then outputs a relevance score.

This joint attention captures subtle relevance that separate embeddings miss. The price is that nothing can be precomputed: the model must run once per query-passage pair, so it is far too slow to score the whole index.
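The difference in what each model sees can be sketched in a few lines. This is only an illustration of the input layout, not a real model: the whitespace tokenizer, the function names, and the BERT-style [CLS]/[SEP] markers are all assumptions made for the example.

```python
def tokenize(text):
    # Whitespace tokenizer for illustration; real models use subword tokenizers.
    return text.lower().split()

def bi_encoder_inputs(query, passage):
    # Two independent sequences: each is encoded without seeing the other.
    return tokenize(query), tokenize(passage)

def cross_encoder_input(query, passage):
    # One joint sequence. The [CLS]/[SEP] markers follow the BERT-style
    # convention and are an assumption, not something the lesson specifies.
    return ["[CLS]"] + tokenize(query) + ["[SEP]"] + tokenize(passage) + ["[SEP]"]

def cross_text_attention_pairs(query, passage, joint):
    # Number of (query token, passage token) pairs that attention can relate,
    # counted in both directions. Separate encoding relates none of them.
    q, p = tokenize(query), tokenize(passage)
    return 2 * len(q) * len(p) if joint else 0

query = "what is reranking"
passage = "reranking reorders retrieved candidates"

print(bi_encoder_inputs(query, passage))
print(cross_encoder_input(query, passage))
print(cross_text_attention_pairs(query, passage, joint=False))
print(cross_text_attention_pairs(query, passage, joint=True))
```

With separate encoding, no query token is ever compared with a passage token inside the model. With the joint sequence, every such pair is available to attention, which is where the extra precision and the extra cost both come from.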

The retrieve then rerank pattern

The bi encoder cheaply narrows millions of passages to a few dozen. The cross encoder then scores just those and reorders them, putting the most relevant passages on top before generation.
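A minimal sketch of the two-stage control flow follows. Both scorers are toy placeholders chosen so the example runs without any model: `embed` is a bag-of-words count vector standing in for a bi encoder, and `joint_score` is a bigram matcher standing in for a cross encoder. The function names and the sample passages are hypothetical; only the retrieve-then-rerank structure is the point.

```python
import math
from collections import Counter

def embed(text):
    # Toy stand-in for a bi encoder: a bag-of-words count vector.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def joint_score(query, passage):
    # Toy stand-in for a cross encoder: it reads both texts together and
    # counts passage bigrams that also occur in the query.
    q, p = query.lower().split(), passage.lower().split()
    query_bigrams = set(zip(q, q[1:]))
    return sum(1 for bigram in zip(p, p[1:]) if bigram in query_bigrams)

def retrieve_then_rerank(query, passages, index, shortlist_size, top_n):
    query_vec = embed(query)
    # Stage 1: cheap vector comparison against the precomputed index.
    by_similarity = sorted(
        range(len(passages)),
        key=lambda i: cosine(query_vec, index[i]),
        reverse=True,
    )
    shortlist = by_similarity[:shortlist_size]
    # Stage 2: the expensive joint scorer runs on the shortlist only.
    reranked = sorted(shortlist, key=lambda i: joint_score(query, passages[i]), reverse=True)
    return [passages[i] for i in reranked[:top_n]]

passages = [
    "man bites dog in the park",
    "dog bites man in the park",
    "stock prices fell sharply today",
    "the park opens at dawn",
]
index = [embed(p) for p in passages]  # built once, before any query arrives

top = retrieve_then_rerank("dog bites man", passages, index, shortlist_size=2, top_n=1)
print(top)
```

In this toy setup the first two passages tie in stage 1 because they contain the same words, and the joint scorer breaks the tie. A real bi encoder is far less crude than a word count, so the example should be read as an illustration of the pipeline shape rather than of model quality.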

Why it matters

This two stage design gets the accuracy of joint encoding while paying its cost only on a small shortlist. The bi encoder supplies recall across the whole index, and the cross encoder supplies precision at the top of the list.
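The saving can be made concrete with a rough count of model forward passes per query. The index size and shortlist size below are assumed values picked to match "millions" and "a few dozen"; the count ignores the vector comparison itself and the one-time offline cost of embedding the passages.

```python
def forward_passes_per_query(index_size, shortlist_size=None):
    # Cross encoder over the whole index: one joint pass per passage.
    if shortlist_size is None:
        return index_size
    # Two stages: one pass to embed the query, then one joint pass
    # per shortlisted passage.
    return 1 + min(shortlist_size, index_size)

INDEX_SIZE = 1_000_000   # assumed
SHORTLIST_SIZE = 50      # assumed

full_index = forward_passes_per_query(INDEX_SIZE)
two_stage = forward_passes_per_query(INDEX_SIZE, SHORTLIST_SIZE)
print(full_index, two_stage)
```

Under these assumptions the two stage design needs 51 passes per query instead of one million.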

Key idea

A cross encoder reranker reads query and passage jointly to score relevance precisely, applied only to the bi encoder shortlist so accuracy rises without scoring the whole index.

Check yourself

1. How does a cross encoder differ from a bi encoder?

2. Why is the cross encoder applied only to a shortlist?