Machine Learning

Cross Encoder Versus Bi Encoder

Accuracy versus speed in how you compare two pieces of text.

Two ways to compare

A bi encoder embeds each text independently into a vector, then compares the vectors with a cheap similarity measure such as cosine similarity or a dot product. A cross encoder feeds both texts together through one transformer and outputs a single relevance score.
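The difference in shape between the two approaches can be sketched with toy stand-ins. Nothing here is a real model: `embed` is a hypothetical bag-of-words "encoder", and `cross_encoder_score` is a hypothetical rule (longest shared run of words) that only exists to show a scorer that looks at both texts at once.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for a bi encoder: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(u: Counter, v: Counter) -> float:
    """Cheap similarity between two vectors that were computed separately."""
    dot = sum(u[token] * v[token] for token in u)
    norm = math.sqrt(sum(x * x for x in u.values()) * sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


def bi_encoder_score(query: str, doc: str) -> float:
    # Each text is encoded on its own; only the finished vectors meet.
    return cosine(embed(query), embed(doc))


def cross_encoder_score(query: str, doc: str) -> float:
    """Toy stand-in for a cross encoder: one score from both texts together.

    Assumed rule for illustration: length of the longest run of consecutive
    words shared by the two texts, divided by the query length.
    """
    q, d = query.lower().split(), doc.lower().split()
    best = 0
    for i in range(len(q)):
        for j in range(len(d)):
            k = 0
            while i + k < len(q) and j + k < len(d) and q[i + k] == d[j + k]:
                k += 1
            best = max(best, k)
    return best / len(q) if q else 0.0


query = "dog bites man"
for doc in ["dog bites man", "man bites dog"]:
    print(doc, bi_encoder_score(query, doc), cross_encoder_score(query, doc))
```

In this toy setup the bag-of-words vectors cannot tell "dog bites man" from "man bites dog", while the joint scorer can. A real bi encoder is far richer than a word count, so treat this only as an illustration of where the two texts are allowed to interact.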

The tradeoff

  • The bi encoder is fast and scalable: you embed a corpus once and reuse the vectors for every query, enabling search over millions of items.
  • The cross encoder is more accurate because attention runs across both texts at once, capturing fine interactions, but it must run a fresh pass for every pair, so nothing can be precomputed.

Why not always use the cross encoder

Scoring a query against a million documents with a cross encoder means a million expensive transformer passes per query, which is far too slow. The bi encoder precomputes the document vectors, so each query costs only a single embedding pass plus a fast nearest neighbor search.
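The arithmetic behind this can be written out as a small sketch. The function names and the figure of 1,000 queries are assumptions chosen for illustration, and the count covers only transformer passes, not the much cheaper vector comparisons.

```python
def cross_encoder_passes(num_docs: int, num_queries: int) -> int:
    """Every (query, document) pair needs its own joint transformer pass."""
    return num_docs * num_queries


def bi_encoder_passes(num_docs: int, num_queries: int) -> int:
    """Documents are embedded once up front; each query adds one more pass."""
    return num_docs + num_queries


docs, queries = 1_000_000, 1_000
print(cross_encoder_passes(docs, queries))  # 1000000000
print(bi_encoder_passes(docs, queries))     # 1001000
```

The cross encoder cost grows with the product of corpus size and query volume, while the bi encoder cost grows with their sum.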

The standard pattern

A common production design is retrieve then rerank:

  • The bi encoder quickly retrieves a few hundred candidates.
  • The cross encoder reranks just those candidates for final precision.

This combines the speed of one with the accuracy of the other.
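A minimal sketch of the two stages, assuming the models are passed in as plain functions. The names `build_index`, `retrieve_then_rerank`, and the `toy_*` helpers are hypothetical, and the sorted scan in stage 1 stands in for a real nearest neighbor index.

```python
from typing import Callable, List, Tuple


def build_index(corpus: List[str], embed: Callable) -> List[Tuple[str, object]]:
    """Embed every document once, ahead of time."""
    return [(doc, embed(doc)) for doc in corpus]


def retrieve_then_rerank(
    query: str,
    index: List[Tuple[str, object]],
    embed: Callable,
    similarity: Callable,
    cross_score: Callable,
    k: int = 200,
    top_n: int = 10,
) -> List[str]:
    # Stage 1 (bi encoder): one query embedding, then cheap vector comparisons.
    q_vec = embed(query)
    candidates = sorted(
        index, key=lambda pair: similarity(q_vec, pair[1]), reverse=True
    )[:k]
    # Stage 2 (cross encoder): expensive joint scoring, but only for k candidates.
    reranked = sorted(
        (doc for doc, _ in candidates),
        key=lambda doc: cross_score(query, doc),
        reverse=True,
    )
    return reranked[:top_n]


# Toy stand-ins so the sketch runs without any model (assumptions, not real encoders).
def toy_embed(text: str) -> frozenset:
    return frozenset(text.lower().split())


def toy_similarity(u: frozenset, v: frozenset) -> float:
    return len(u & v) / len(u | v) if u | v else 0.0


def toy_cross_score(query: str, doc: str) -> float:
    return 1.0 if query.lower() in doc.lower() else 0.0


corpus = ["man bites dog", "dog bites man", "cat sleeps all day"]
index = build_index(corpus, toy_embed)
results = retrieve_then_rerank(
    "dog bites man", index, toy_embed, toy_similarity, toy_cross_score, k=2, top_n=1
)
print(results)  # ['dog bites man']
```

Here stage 1 scores the first two documents identically, and the joint scorer in stage 2 breaks the tie. In a real system the number of candidates `k` is the main knob: a larger value gives the reranker more chances to recover a good document but costs more cross encoder passes per query.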

Key idea

Bi encoders embed texts separately for fast, scalable retrieval, while cross encoders score pairs jointly for higher accuracy. A retrieve then rerank pipeline uses each where it is strongest.

Check yourself

1. Why is a bi encoder more scalable than a cross encoder?

2. Why is a cross encoder usually more accurate?

3. What does the retrieve then rerank pattern combine?