System Design

The Spotify Recommendation

Collaborative filtering and offline embedding training power personalized playlists.

6 min read · advanced

Personalized at scale

Spotify recommends tracks tuned to each listener. The core insight is collaborative filtering: listeners whose history overlaps with yours are good predictors of what you will like next.
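To make the idea concrete, here is a minimal sketch of user-based collaborative filtering. It is an illustration only, not Spotify's implementation: the listening histories, the user and track names, and the choice of Jaccard overlap as the similarity measure are all assumptions.

```python
from collections import Counter

# Hypothetical listening histories: user -> set of track ids.
HISTORY = {
    "ana": {"t1", "t2", "t3"},
    "ben": {"t1", "t2", "t4"},
    "cy": {"t5", "t6"},
}


def jaccard(a, b):
    """Overlap between two sets, from 0.0 (disjoint) to 1.0 (identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(user, history, k=3):
    """Score tracks the user has not heard by the similarity of the
    listeners who played them, then return the top k."""
    mine = history[user]
    scores = Counter()
    for other, theirs in history.items():
        if other == user:
            continue
        sim = jaccard(mine, theirs)
        if sim == 0.0:
            continue  # listeners with no overlap contribute nothing
        for track in theirs - mine:
            scores[track] += sim
    return [track for track, _ in scores.most_common(k)]


print(recommend("ana", HISTORY))  # ['t4']
```

Ana and Ben share two tracks, so Ben's remaining track is suggested to Ana. Cy shares nothing with either of them and has no influence on the result.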

Embeddings and similarity

Users and tracks are mapped into a shared embedding space so that similar items sit close together. A recommendation then becomes a nearest-neighbor search around your taste vector.

  • Listening history trains user and track embeddings
  • Candidate tracks are the nearest neighbors to your vector
  • A ranking model orders candidates before display
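The nearest-neighbor step above can be sketched as a brute-force search. The three-dimensional vectors and track names below are made up for illustration, and cosine similarity is an assumed distance measure. Learned embeddings have many more dimensions, and large catalogs typically use an approximate index rather than a full scan.

```python
import math

# Hypothetical 3-dimensional track embeddings (real ones are learned from listening history).
TRACKS = {
    "indie_a": [0.9, 0.1, 0.0],
    "indie_b": [0.8, 0.2, 0.1],
    "techno_a": [0.0, 0.1, 0.9],
    "jazz_a": [0.1, 0.9, 0.1],
}


def cosine(u, v):
    """Cosine similarity: 1.0 means same direction, 0.0 means orthogonal."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def nearest_tracks(taste, tracks, k=2):
    """Return the k track names whose vectors are most similar to the taste vector."""
    ranked = sorted(tracks, key=lambda name: cosine(taste, tracks[name]), reverse=True)
    return ranked[:k]


taste = [1.0, 0.1, 0.0]  # hypothetical user vector leaning toward the first dimension
print(nearest_tracks(taste, TRACKS))  # ['indie_a', 'indie_b']
```

The returned names are only candidates. Ordering them for display is the job of the ranking model in the last bullet.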

Offline training, online retrieval and ranking

Heavy embedding training runs offline in batch. At request time the system does a fast nearest-neighbor lookup to fetch candidates, then a lighter model ranks them. This two-stage split keeps responses quick because the ranking model scores only a short candidate list rather than the whole catalog.

Quality comes from learned similarity, and speed comes from pushing the expensive training offline while keeping serving cheap.
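A rough sketch of the split, under stated assumptions: `TRACK_VECS` and `POPULARITY` stand in for artifacts a batch job would have written ahead of time, and the 0.7/0.3 blend in `rank` is an arbitrary placeholder for a real ranking model. None of these names or numbers come from Spotify.

```python
# Stand-ins for artifacts an offline batch job would produce (all values hypothetical).
TRACK_VECS = {
    "t1": (0.9, 0.1),
    "t2": (0.8, 0.3),
    "t3": (0.1, 0.9),
    "t4": (0.7, 0.2),
}
POPULARITY = {"t1": 0.2, "t2": 0.9, "t3": 0.5, "t4": 0.4}


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def retrieve(taste, k):
    """Stage 1: narrow the catalog to k candidates by embedding similarity."""
    return sorted(TRACK_VECS, key=lambda t: dot(taste, TRACK_VECS[t]), reverse=True)[:k]


def rank(taste, candidates):
    """Stage 2: reorder only the candidates with a toy scoring function
    that blends similarity with popularity."""
    def score(t):
        return 0.7 * dot(taste, TRACK_VECS[t]) + 0.3 * POPULARITY[t]
    return sorted(candidates, key=score, reverse=True)


def recommend(taste, k=3):
    return rank(taste, retrieve(taste, k))


print(retrieve((1.0, 0.0), 3))  # ['t1', 't2', 't4']
print(recommend((1.0, 0.0)))    # ['t2', 't1', 't4']
```

Retrieval looks at every track but does only a cheap dot product on each. The ranker can afford richer signals because it sees just the k survivors, which is why the final order can differ from the retrieval order.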

Key idea

Learn user and track embeddings offline, retrieve candidates with a fast nearest-neighbor lookup, then rank them online so recommendations stay both personal and fast.

Check yourself

1. What is the idea behind collaborative filtering?

2. Why is embedding training done offline rather than per request?