Machine Learning

Content-Based Filtering

Recommend items similar to what a user already liked using item attributes.

The idea

Content-based filtering builds a user profile from the attributes of items the user engaged with, then recommends other items whose attributes match that profile. If you watched several space documentaries, it scores other space documentaries highly.

How it works

  • Describe each item with a feature vector. For text this might be TF-IDF weights over words; for movies it could be an encoding of genre, cast, and tags.
  • Build a user profile by aggregating the vectors of the items the user liked, often as a weighted average.
  • Score a candidate item by its similarity to the user profile, commonly cosine similarity.
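
The three steps above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not a production recommender: the catalog `ITEMS`, its four hand-made features, and the helper names `build_profile`, `cosine`, and `recommend` are all hypothetical and exist only for this example.

```python
import math

# Hypothetical catalog. Each item is a hand-made feature vector over
# [space, documentary, comedy, romance]; real systems would derive these
# from metadata or TF-IDF weights.
ITEMS = {
    "Cosmos": [1.0, 1.0, 0.0, 0.0],
    "Apollo 11": [1.0, 1.0, 0.0, 0.0],
    "Planet Earth": [0.0, 1.0, 0.0, 0.0],
    "Galaxy Quest": [1.0, 0.0, 1.0, 0.0],
    "Notting Hill": [0.0, 0.0, 1.0, 1.0],
}


def build_profile(liked):
    """Weighted average of liked item vectors; `liked` maps item name -> weight."""
    total = sum(liked.values())
    dim = len(next(iter(ITEMS.values())))
    profile = [0.0] * dim
    for name, weight in liked.items():
        for i, value in enumerate(ITEMS[name]):
            profile[i] += weight * value / total
    return profile


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(liked, k=3):
    """Rank items the user has not liked yet by similarity to their profile."""
    profile = build_profile(liked)
    scores = {
        name: cosine(profile, vector)
        for name, vector in ITEMS.items()
        if name not in liked
    }
    return sorted(scores.items(), key=lambda pair: pair[1], reverse=True)[:k]


if __name__ == "__main__":
    for name, score in recommend({"Cosmos": 1.0, "Apollo 11": 1.0}):
        print(f"{name}: {score:.2f}")
```

With two space documentaries liked, the profile points along the space and documentary axes, so "Planet Earth" ranks first, "Galaxy Quest" second, and "Notting Hill", which shares no features with the profile, scores zero.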

Strengths

  • Works for a new item the moment it has attributes, even with zero interactions.
  • Recommendations are explainable: "recommended because you liked X, which is about space."
  • No dependency on other users, so it works even with a small user base.
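
One way to see where the explainability comes from: a similarity score is built from per-feature products, so the features that contribute most can be reported back to the user. The sketch below assumes a hypothetical four-feature layout and a made-up profile; `FEATURES` and `explain` are illustrative names, not part of any library.

```python
# Hypothetical feature layout shared by the user profile and item vectors.
FEATURES = ["space", "documentary", "comedy", "romance"]


def explain(profile, item_vector, top_n=2):
    """Return the features that contribute most to the profile-item dot product."""
    contributions = [
        (name, p * v) for name, p, v in zip(FEATURES, profile, item_vector)
    ]
    positive = [pair for pair in contributions if pair[1] > 0]
    return sorted(positive, key=lambda pair: pair[1], reverse=True)[:top_n]


if __name__ == "__main__":
    profile = [1.0, 0.8, 0.0, 0.0]    # made-up profile leaning toward space
    candidate = [1.0, 1.0, 0.0, 0.0]  # a space documentary
    reasons = explain(profile, candidate)
    print("Recommended because of: " + ", ".join(name for name, _ in reasons))
```

Here the explanation names "space" first and "documentary" second, which is the kind of reason a user can read and verify.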

Weaknesses

  • It tends to stay inside a narrow lane: it recommends more of what the user already liked and rarely surprises them.
  • It needs good attributes, which are expensive to curate.

Key idea

Content-based filtering matches items to a user profile built from item attributes, giving explainable picks but limited novelty.

Check yourself

1. What does content-based filtering rely on?

2. Which is a known weakness of content-based filtering?