Candidate Generation and Ranking

Why two stages

You cannot run an expensive model over millions of items for every request. Industrial recommenders use a funnel: a cheap stage narrows the catalog, then an expensive stage carefully orders the survivors.
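
A rough back-of-the-envelope calculation makes the cost gap concrete. The sketch below uses made-up, illustrative numbers (catalog size, candidate count, and per-item costs are all assumptions, not measurements from any real system) and compares scoring everything with the heavy model against a cheap pass followed by heavy scoring of the survivors only.

```python
# Illustrative numbers only; every constant here is an assumption.
CATALOG_SIZE = 5_000_000   # items in the catalog
CANDIDATES = 500           # items that survive the cheap stage
CHEAP_COST_NS = 10         # nanoseconds per item for the cheap stage
HEAVY_COST_NS = 50_000     # nanoseconds per item for the heavy model

NS_PER_MS = 1_000_000


def heavy_only_ms(catalog_size: int = CATALOG_SIZE) -> float:
    """Per-request cost if the heavy model scored every catalog item."""
    return catalog_size * HEAVY_COST_NS / NS_PER_MS


def funnel_ms(catalog_size: int = CATALOG_SIZE,
              candidates: int = CANDIDATES) -> float:
    """Per-request cost of a cheap pass over the catalog plus heavy scoring of survivors."""
    cheap = catalog_size * CHEAP_COST_NS
    heavy = candidates * HEAVY_COST_NS
    return (cheap + heavy) / NS_PER_MS


if __name__ == "__main__":
    print(f"heavy model on everything: {heavy_only_ms():,.0f} ms")
    print(f"two-stage funnel:          {funnel_ms():,.0f} ms")
```

Under these assumed costs the heavy-only approach takes minutes per request while the funnel takes tens of milliseconds; the exact ratio depends entirely on the numbers plugged in.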

Stage one: candidate generation

Goal: from millions of items, retrieve a few hundred plausible candidates fast.
Methods are lightweight: nearest-neighbor lookup over embeddings, simple co-interaction rules, or recent and popular items.
Optimized for recall, not precision. Missing a good item here means it can never be recommended, so cast a wide net.
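
The retrieval step described above can be sketched as follows. This is a minimal illustration, not a production design: it assumes user and item embeddings already exist as plain lists of floats, uses a brute-force dot-product scan where a real system would typically use an approximate nearest-neighbor index, and the function and parameter names (generate_candidates, k_nearest, k_popular) are hypothetical.

```python
import heapq
from typing import Dict, List, Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product, used here as the similarity between two embeddings."""
    return sum(x * y for x, y in zip(a, b))


def generate_candidates(
    user_vec: Sequence[float],
    item_vecs: Dict[str, Sequence[float]],
    popular_items: List[str],
    k_nearest: int = 3,
    k_popular: int = 2,
) -> List[str]:
    """Merge several cheap sources into one candidate list (hypothetical helper)."""
    # Source 1: items whose embeddings score highest against the user embedding.
    nearest = heapq.nlargest(
        k_nearest, item_vecs, key=lambda item_id: dot(user_vec, item_vecs[item_id])
    )
    # Source 2: globally popular items, the same for every user.
    popular = popular_items[:k_popular]
    # Union of both sources, keeping first-seen order and dropping duplicates.
    return list(dict.fromkeys(nearest + popular))


if __name__ == "__main__":
    items = {
        "a": [1.0, 0.0],
        "b": [0.9, 0.1],
        "c": [0.0, 1.0],
        "d": [-1.0, 0.0],
    }
    print(generate_candidates([1.0, 0.0], items, ["d", "a"], k_nearest=2))
```

Merging several sources is one way to cast a wide net: an item missed by the embedding lookup can still enter the candidate set through the popularity list.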

Stage two: ranking

Goal: order the few hundred candidates precisely.
Uses a heavier model with rich features: user history, item features, context like time and device, and cross features.
Optimized for precision at the top, since only a handful are shown.
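
A ranker along these lines might look like the sketch below. It is only an illustration: a small logistic scorer with hand-set weights stands in for the heavier model (real rankers are often gradient-boosted trees or neural networks), and the feature names, weights, and helper functions are all hypothetical.

```python
import math
from typing import Dict, List

Features = Dict[str, float]


def build_features(user: Features, item: Features, context: Features) -> Features:
    """Combine user, item, and context features, plus user-by-item cross features."""
    feats: Features = {}
    for name, value in user.items():
        feats[f"user:{name}"] = value
    for name, value in item.items():
        feats[f"item:{name}"] = value
    for name, value in context.items():
        feats[f"ctx:{name}"] = value
    # Cross features: the product of each user feature with each item feature.
    for u_name, u_value in user.items():
        for i_name, i_value in item.items():
            feats[f"cross:{u_name}*{i_name}"] = u_value * i_value
    return feats


def score(feats: Features, weights: Features, bias: float = 0.0) -> float:
    """Logistic score in (0, 1); features without a weight contribute nothing."""
    z = bias + sum(weights.get(name, 0.0) * value for name, value in feats.items())
    return 1.0 / (1.0 + math.exp(-z))


def rank(candidates: Dict[str, Features], user: Features, context: Features,
         weights: Features, top_n: int = 2) -> List[str]:
    """Score every candidate and return the ids of the top_n highest-scoring ones."""
    scored = [
        (score(build_features(user, item, context), weights), item_id)
        for item_id, item in candidates.items()
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [item_id for _, item_id in scored[:top_n]]
```

Because only the few hundred survivors of stage one are scored, the per-item feature construction here can afford to be far more expensive than anything done during retrieval.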

Why split the work

The funnel spends compute where it matters. Cheap retrieval handles scale; the rich ranker handles quality on a small set. Many systems add a final re-ranking stage for diversity and business rules.
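
One simple form a final re-ranking pass could take is sketched below. It assumes the ranker has already produced an ordered list, enforces diversity with a per-category cap, and models a business rule as a blocklist; the cap, the blocklist, and the function name are illustrative assumptions rather than a standard recipe.

```python
from collections import Counter
from typing import AbstractSet, Dict, List


def rerank_with_category_cap(
    ranked: List[str],
    category_of: Dict[str, str],
    max_per_category: int = 2,
    blocked: AbstractSet[str] = frozenset(),
) -> List[str]:
    """Re-order a ranked list so no category dominates the top positions."""
    seen: Counter = Counter()
    kept: List[str] = []
    deferred: List[str] = []
    for item_id in ranked:
        # Business rule (assumed): blocked items are dropped entirely.
        if item_id in blocked:
            continue
        category = category_of[item_id]
        if seen[category] < max_per_category:
            seen[category] += 1
            kept.append(item_id)
        else:
            # Over the cap: push the item below everything that fit under it.
            deferred.append(item_id)
    return kept + deferred


if __name__ == "__main__":
    ranked = ["s1", "s2", "s3", "n1", "m1"]
    categories = {"s1": "sports", "s2": "sports", "s3": "sports",
                  "n1": "news", "m1": "music"}
    print(rerank_with_category_cap(ranked, categories, blocked={"m1"}))
```

The pass preserves the ranker's relative order within each group, so it only trades a little top-of-list precision for variety.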

Key idea

Recommenders use a funnel: cheap, high-recall candidate generation narrows millions to hundreds, then a rich, high-precision ranker orders the survivors.

Check yourself