System Design

Design YouTube Video Recommendations

Generate personalized video suggestions with candidate generation and ranking.

Requirements

  • Suggest videos a user is likely to watch next.
  • Serve recommendations with low latency from a catalog of billions.
  • Balance relevance with freshness and diversity.

High-level design

Recommendations use a two-stage funnel: candidate generation, then ranking.

  • Candidate generation: cheap models retrieve a few hundred candidates from billions using embeddings and approximate nearest neighbor search.
  • Ranking: a heavier model scores each candidate using rich features like watch history, recency, and engagement signals.
  • Serving: precompute candidate sets offline where possible and rank at request time.
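The funnel above can be sketched in a few lines. This is a minimal illustration, not a production design: the `Video` fields, the hand-picked score weights, and the function names are all hypothetical, and an exact brute-force top-k stands in for a real approximate nearest neighbor index.

```python
import heapq
from dataclasses import dataclass
from typing import List, Set


@dataclass
class Video:
    video_id: str
    embedding: List[float]   # assumed to be learned offline; values are made up
    age_hours: float
    engagement_rate: float   # assumed to lie in [0, 1]


def dot(a: List[float], b: List[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def generate_candidates(user_emb: List[float], catalog: List[Video], k: int) -> List[Video]:
    # Stage 1: cheap retrieval by embedding similarity.
    # Exact top-k here; a real system would query an ANN index instead.
    return heapq.nlargest(k, catalog, key=lambda v: dot(user_emb, v.embedding))


def rank(user_emb: List[float], candidates: List[Video], watched_ids: Set[str]) -> List[Video]:
    # Stage 2: a richer score per candidate. The weights are illustrative
    # assumptions standing in for a trained ranking model.
    def score(v: Video) -> float:
        relevance = dot(user_emb, v.embedding)
        freshness = 1.0 / (1.0 + v.age_hours / 24.0)
        watched_penalty = 1.0 if v.video_id in watched_ids else 0.0
        return relevance + 0.5 * v.engagement_rate + 0.3 * freshness - watched_penalty

    return sorted(candidates, key=score, reverse=True)


def recommend(user_emb, catalog, watched_ids, num_candidates=300, num_results=20):
    candidates = generate_candidates(user_emb, catalog, num_candidates)
    return rank(user_emb, candidates, watched_ids)[:num_results]
```

Only the `num_candidates` survivors of stage 1 ever reach the more expensive `score` function, which is the point of the funnel: the ranker can reorder candidates using engagement and freshness, but it never sees the rest of the catalog.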

Bottlenecks

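  • Catalog size: scoring billions of videos per request is infeasible within a latency budget, so the funnel narrows first with cheap retrieval.
  • Feature freshness: stale signals degrade ranking quality, so a feature store serves up-to-date signals to the ranker.
  • Feedback loops: popular videos get more exposure, which earns them more engagement and still more exposure, so inject exploration and diversity to break the cycle.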
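One way to picture the exploration and diversity step is as a post-processing pass over the ranked list. The sketch below assumes two simple, swapped-in techniques that the lesson does not prescribe: a per-channel cap for diversity and epsilon-greedy slot replacement for exploration. The function names and the `(video_id, channel)` tuple shape are hypothetical.

```python
import random
from typing import List, Tuple

Item = Tuple[str, str]  # (video_id, channel), an assumed shape for illustration


def cap_per_channel(ranked: List[Item], max_per_channel: int) -> List[Item]:
    # Diversity: keep at most max_per_channel items from any one channel,
    # preserving the original rank order.
    counts = {}
    kept = []
    for video_id, channel in ranked:
        if counts.get(channel, 0) < max_per_channel:
            kept.append((video_id, channel))
            counts[channel] = counts.get(channel, 0) + 1
    return kept


def inject_exploration(ranked: List[Item], explore_pool: List[Item],
                       epsilon: float, rng: random.Random) -> List[Item]:
    # Exploration: with probability epsilon per slot, swap in a random item
    # from explore_pool (e.g. new or rarely shown videos) not already ranked.
    pool = [item for item in explore_pool if item not in ranked]
    result = []
    for item in ranked:
        if pool and rng.random() < epsilon:
            result.append(pool.pop(rng.randrange(len(pool))))
        else:
            result.append(item)
    return result
```

A small `epsilon` keeps most slots relevance-driven while still giving less-exposed videos a chance to collect engagement data. Passing a seeded `random.Random` makes the behavior reproducible in tests.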

Tradeoffs

  • More candidates improve recall but raise ranking cost.
  • Heavier ranking models improve quality but add latency.
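Both tradeoffs come down to simple arithmetic on the latency budget. The sketch below assumes a deliberately crude cost model in which the ranker scores candidates in sequential fixed-size batches. The function names and every number used with them are illustrative assumptions, not measurements.

```python
import math


def ranking_latency_ms(num_candidates: int, batch_size: int, ms_per_batch: float) -> float:
    # Assumed model: batches are scored one after another, so latency
    # grows with the number of batches needed to cover all candidates.
    return math.ceil(num_candidates / batch_size) * ms_per_batch


def max_candidates_within_budget(budget_ms: float, retrieval_ms: float,
                                 batch_size: int, ms_per_batch: float) -> int:
    # Whatever the retrieval stage leaves of the budget is spent on ranking.
    remaining = budget_ms - retrieval_ms
    if remaining <= 0:
        return 0
    return int(remaining // ms_per_batch) * batch_size


# Hypothetical numbers: 100 ms budget, 20 ms retrieval, batches of 100.
print(max_candidates_within_budget(100.0, 20.0, 100, 8.0))   # 1000
# A ranker twice as slow per batch halves the candidates that fit.
print(max_candidates_within_budget(100.0, 20.0, 100, 16.0))  # 500
```

Under this model, a heavier ranker (larger `ms_per_batch`) directly shrinks the candidate set that fits in the budget, which in turn limits recall.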

Key idea

Recommendations are a two-stage funnel: cheap retrieval narrows billions of videos to hundreds, and a heavy ranker orders the survivors using rich features.

Check yourself

1. Why use a two-stage funnel of candidate generation followed by ranking?

2. Why inject exploration and diversity into recommendations?