Machine Learning

Candidate Generation and Ranking

The two-stage funnel that turns millions of items into a short ranked list.

Why two stages

You cannot run an expensive model over millions of items for every request. Industrial recommenders use a funnel: a cheap stage narrows the catalog, then an expensive stage carefully orders the survivors.
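A back-of-the-envelope comparison makes the cost gap concrete. Every number below (catalog size, per-item ranker cost, retrieval cost, candidate count) is a hypothetical assumption chosen only for illustration, not a measurement from any real system.

```python
# All figures are illustrative assumptions, not measurements.
CATALOG_SIZE = 5_000_000        # items in the catalog (assumed)
HEAVY_COST_US = 50              # microseconds for the heavy model to score one item (assumed)
RETRIEVAL_COST_US = 5_000       # microseconds for one cheap retrieval call (assumed)
NUM_CANDIDATES = 500            # survivors passed to the ranker (assumed)

# One stage: the heavy model scores the whole catalog on every request.
single_stage_us = CATALOG_SIZE * HEAVY_COST_US

# Funnel: one cheap retrieval, then the heavy model scores only the survivors.
funnel_us = RETRIEVAL_COST_US + NUM_CANDIDATES * HEAVY_COST_US

print(f"single stage: {single_stage_us / 1_000_000:.0f} s per request")
print(f"funnel:       {funnel_us / 1_000:.0f} ms per request")
```

Under these assumed numbers, per-request work drops from minutes to tens of milliseconds, because the heavy model touches hundreds of items instead of millions.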

Stage one: candidate generation

  • Goal: from millions of items, retrieve a few hundred plausible candidates fast.
  • Methods are lightweight: nearest-neighbor lookup over embeddings, simple co-interaction rules (users who engaged with X also engaged with Y), or recent and popular items.
  • Optimized for recall, not precision. Missing a good item here means it can never be recommended, so cast a wide net.
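The candidate generation stage described above can be sketched as follows. This is a minimal illustration, assuming tiny hand-written embeddings and a brute-force scan; a real system would more likely use an approximate nearest-neighbor index. The names `generate_candidates`, `item_vecs`, and `popular_ids` are hypothetical.

```python
import heapq


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def generate_candidates(user_vec, item_vecs, popular_ids, k):
    """Merge the k nearest items by embedding with a popular-items list.

    Taking the union of several cheap sources widens the net, which
    favors recall over precision.
    """
    # Brute-force scan over all items (assumption: small toy catalog).
    nearest = heapq.nlargest(
        k, item_vecs, key=lambda item_id: dot(user_vec, item_vecs[item_id])
    )
    seen, candidates = set(), []
    for item_id in nearest + list(popular_ids):
        if item_id not in seen:  # deduplicate while preserving order
            seen.add(item_id)
            candidates.append(item_id)
    return candidates


# Toy 2-dimensional embeddings (made up for the example).
item_vecs = {
    "a": [1.0, 0.0],
    "b": [0.9, 0.1],
    "c": [0.0, 1.0],
    "d": [-1.0, 0.0],
    "e": [0.5, 0.5],
}
candidates = generate_candidates([1.0, 0.0], item_vecs, popular_ids=["c", "a"], k=2)
print(candidates)  # ['a', 'b', 'c']
```

Item "c" is far from the user in embedding space but still survives through the popularity source, which is the point of casting a wide net here.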

Stage two: ranking

  • Goal: order the few hundred candidates precisely.
  • Uses a heavier model with rich features: user history, item features, context like time and device, and cross features.
  • Optimized for precision at the top, since only a handful are shown.
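The ranking stage above can be sketched as scoring each candidate from its features and keeping the top few. The hand-set logistic weights below are a stand-in for a trained model, and the feature names (including the cross feature `is_mobile_x_short_video`) are hypothetical.

```python
import math

# Hypothetical hand-set weights standing in for a trained ranking model.
WEIGHTS = {
    "user_item_affinity": 2.0,       # derived from user history
    "item_popularity": 0.5,          # item feature
    "is_mobile_x_short_video": 1.0,  # cross feature: device context x item type
}
BIAS = -1.0


def score(features):
    """Logistic score in (0, 1) from a dict of feature name -> value."""
    z = BIAS + sum(WEIGHTS[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))


def rank(candidate_features, top_n):
    """Return the top_n item ids ordered by descending score."""
    ordered = sorted(
        candidate_features,
        key=lambda item_id: score(candidate_features[item_id]),
        reverse=True,
    )
    return ordered[:top_n]


candidate_features = {
    "a": {"user_item_affinity": 0.9, "item_popularity": 0.2, "is_mobile_x_short_video": 0.0},
    "b": {"user_item_affinity": 0.4, "item_popularity": 0.9, "is_mobile_x_short_video": 1.0},
    "c": {"user_item_affinity": 0.1, "item_popularity": 1.0, "is_mobile_x_short_video": 0.0},
}
top = rank(candidate_features, top_n=2)
print(top)  # ['b', 'a']
```

In this toy data the cross feature lifts "b" above "a" even though "a" has the higher affinity, which is the kind of interaction a richer ranker can capture and a simple retrieval rule cannot.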

Why split the work

The funnel spends compute where it matters. Cheap retrieval handles scale; the rich ranker handles quality on a small set. Many systems add a final re-ranking stage for diversity and business rules.
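A final re-ranking pass could look roughly like the sketch below. It assumes one simple diversity rule (a cap on items per category) and one simple business rule (a blocklist); real systems use many other policies, and the names `rerank`, `category_of`, and `blocked_ids` are hypothetical.

```python
def rerank(ranked_ids, category_of, blocked_ids, max_per_category, top_n):
    """Walk the ranked list in order, applying business and diversity rules."""
    result, per_category = [], {}
    for item_id in ranked_ids:
        if item_id in blocked_ids:
            continue  # business rule: never show blocked items
        category = category_of[item_id]
        if per_category.get(category, 0) >= max_per_category:
            continue  # diversity rule: category already at its cap
        per_category[category] = per_category.get(category, 0) + 1
        result.append(item_id)
        if len(result) == top_n:
            break
    return result


# Ranker output, best first (toy data).
ranked = ["a", "b", "c", "d", "e"]
category_of = {"a": "news", "b": "news", "c": "news", "d": "sports", "e": "music"}

final = rerank(ranked, category_of, blocked_ids={"b"}, max_per_category=1, top_n=3)
print(final)  # ['a', 'd', 'e']
```

The greedy walk keeps the ranker's order wherever the rules allow, so the pass adjusts the list without rescoring anything.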

Key idea

Recommenders use a funnel: cheap, high-recall candidate generation narrows millions to hundreds, then a rich, high-precision ranker orders the survivors.

Check yourself

1. What is the main goal of the candidate generation stage?

2. Why use a two-stage funnel instead of one model?

3. Which stage is optimized for precision at the top?