Why two stages
Scoring millions of items with an expensive model on every request is too slow and too costly. Industrial recommenders use a funnel: a cheap stage narrows the catalog, then an expensive stage carefully orders the survivors.
Stage one: candidate generation
- Goal: from millions of items, retrieve a few hundred plausible candidates fast.
- Methods are lightweight: nearest-neighbor lookup over embeddings, simple co-interaction rules, or recent and popular items.
- Optimized for recall, not precision. Missing a good item here means it can never be recommended, so cast a wide net.
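A minimal sketch of candidate generation, assuming items and users are already embedded in the same vector space. The names `generate_candidates`, `ITEM_VECS`, and `popular_ids` are hypothetical, and the brute-force dot-product scan stands in for the approximate nearest-neighbor index a production system would more likely use. Unioning the nearest items with popular ones illustrates casting a wide net.

```python
import heapq

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def generate_candidates(user_vec, item_vecs, popular_ids, k=2):
    """Return the k items closest to the user, plus popular items.

    Brute-force scan for illustration only (assumption): at catalog
    scale this lookup would be served by an approximate index.
    """
    nearest = heapq.nlargest(
        k, item_vecs, key=lambda item_id: dot(user_vec, item_vecs[item_id])
    )
    # Union the two sources, keeping order and dropping duplicates.
    seen = set()
    candidates = []
    for item_id in nearest + list(popular_ids):
        if item_id not in seen:
            seen.add(item_id)
            candidates.append(item_id)
    return candidates

# Toy catalog with 2-dimensional embeddings (made-up values).
ITEM_VECS = {
    "a": [1.0, 0.0],
    "b": [0.9, 0.1],
    "c": [0.0, 1.0],
    "d": [-1.0, 0.0],
}

candidates = generate_candidates([1.0, 0.0], ITEM_VECS, popular_ids=["c"], k=2)
print(candidates)
```

Item "c" is not close to the user but still survives through the popularity source. This stage only needs to keep good items in the pool, not order them.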
Stage two: ranking
- Goal: order the few hundred candidates precisely.
- Uses a heavier model with rich features: user history, item features, context like time and device, and cross features.
- Optimized for precision at the top, since only a handful are shown.
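The ranking stage can be sketched as a scoring function over richer features. Everything here is an assumption for illustration: the feature names, the hand-set `WEIGHTS`, and the logistic form. A real ranker would learn its parameters from interaction data and is usually a far larger model.

```python
import math

# Hypothetical hand-set weights; a real ranker learns these from data.
WEIGHTS = {
    "affinity": 2.0,
    "item_quality": 1.0,
    "is_mobile": -0.5,
    "affinity_x_mobile": 1.5,
}

def features(user, item, context):
    """Build user-history, item, context, and cross features."""
    affinity = user["category_affinity"].get(item["category"], 0.0)
    is_mobile = 1.0 if context["device"] == "mobile" else 0.0
    return {
        "affinity": affinity,                       # from user history
        "item_quality": item["quality"],            # item feature
        "is_mobile": is_mobile,                     # context feature
        "affinity_x_mobile": affinity * is_mobile,  # cross feature
    }

def score(user, item, context):
    """Logistic score in (0, 1); higher means more relevant."""
    feats = features(user, item, context)
    z = sum(WEIGHTS[name] * value for name, value in feats.items())
    return 1.0 / (1.0 + math.exp(-z))

def rank(user, candidates, context, top_n=2):
    """Order candidates by score and keep only the top few to show."""
    ordered = sorted(
        candidates, key=lambda item: score(user, item, context), reverse=True
    )
    return ordered[:top_n]

user = {"category_affinity": {"news": 0.8, "sports": 0.1}}
context = {"device": "mobile"}
candidates = [
    {"id": "s1", "category": "sports", "quality": 0.9},
    {"id": "m1", "category": "music", "quality": 0.7},
    {"id": "n1", "category": "news", "quality": 0.5},
]

top = rank(user, candidates, context)
print([item["id"] for item in top])
```

Running a per-item feature computation like this is affordable only because the input is a few hundred candidates rather than the whole catalog.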
Why split the work
The funnel spends compute where it matters. Cheap retrieval handles scale; the rich ranker handles quality on a small set. Many systems add a final re-ranking stage for diversity and business rules.
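One possible shape for a final re-ranking pass, assuming the ranker has already produced an ordered list. The per-category cap is a hypothetical diversity rule chosen for illustration, and `rerank_with_category_cap` is not taken from any particular system.

```python
def rerank_with_category_cap(ranked_items, max_per_category=1, top_n=3):
    """Walk the ranked list in order, skipping items whose category is full."""
    counts = {}
    result = []
    for item in ranked_items:
        category = item["category"]
        if counts.get(category, 0) < max_per_category:
            result.append(item)
            counts[category] = counts.get(category, 0) + 1
        if len(result) == top_n:
            break
    return result

# Output of the ranking stage, best first (made-up example data).
ranked = [
    {"id": "n1", "category": "news"},
    {"id": "n2", "category": "news"},
    {"id": "s1", "category": "sports"},
    {"id": "m1", "category": "music"},
]

final = rerank_with_category_cap(ranked)
print([item["id"] for item in final])
```

The second news item is dropped even though it outscored the sports and music items, which trades a little predicted relevance for a more varied slate.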
Key idea
Recommenders use a funnel: cheap, high-recall candidate generation narrows millions to hundreds, then a rich, high-precision ranker orders the survivors.