Matrix Factorization For Recommendations

The setup

A recommendation problem is a giant, mostly empty matrix of users by items. Matrix factorization approximates it as the product of two smaller matrices, one for users and one for items.

Latent factors

Each user becomes a short vector of latent factors, and so does each item. A predicted rating is the dot product of a user vector and an item vector.

Factors are not labeled, but they often capture themes like genre or price level.
A high score means the user vector and the item vector point in similar directions, scaled by their magnitudes.
The model learns factors that reconstruct the observed ratings.
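
To make the dot-product prediction concrete, here is a minimal sketch using NumPy. The factor values, the choice of two latent dimensions, and the names user_factors, item_factors, and predict are all made up for illustration; a real model would learn the factors from data.

```python
import numpy as np

# Hypothetical factors: 3 users and 4 items, each with k = 2 latent dimensions.
user_factors = np.array([
    [1.0, 0.0],
    [0.0, 1.0],
    [0.5, 0.5],
])
item_factors = np.array([
    [2.0, 0.0],
    [0.0, 3.0],
    [1.0, 1.0],
    [4.0, 2.0],
])


def predict(user, item):
    """Predicted rating: dot product of one user vector and one item vector."""
    return float(user_factors[user] @ item_factors[item])


# Every user-item prediction at once: a 3 x 4 matrix of scores.
all_predictions = user_factors @ item_factors.T
```

Here predict(2, 3) combines user 2's vector [0.5, 0.5] with item 3's vector [4.0, 2.0], and all_predictions fills in every cell of the user-by-item matrix, including pairs that have no observed rating.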

How it trains

Minimize squared error on known entries only, ignoring the empty cells.
Add regularization so factors do not overfit sparse data.
Solve with stochastic gradient descent or alternating least squares.
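
The three steps above can be sketched as a small stochastic gradient descent loop. This is an illustrative sketch, not a tuned implementation: the function names train_sgd and rmse, the toy ratings, and the hyperparameters (k, lr, reg, epochs) are assumptions chosen for the example.

```python
import numpy as np


def train_sgd(ratings, n_users, n_items, k=2, lr=0.05, reg=0.05, epochs=500, seed=0):
    """Fit user factors P and item factors Q on observed (user, item, rating) triples."""
    rng = np.random.default_rng(seed)
    P = rng.normal(scale=0.1, size=(n_users, k))
    Q = rng.normal(scale=0.1, size=(n_items, k))
    for _ in range(epochs):
        # Visit only the known entries, in random order; empty cells are never touched.
        for idx in rng.permutation(len(ratings)):
            u, i, r = ratings[idx]
            err = r - P[u] @ Q[i]
            p_old = P[u].copy()
            # Gradient step on squared error plus an L2 penalty on the factors.
            P[u] += lr * (err * Q[i] - reg * P[u])
            Q[i] += lr * (err * p_old - reg * Q[i])
    return P, Q


def rmse(ratings, P, Q):
    """Root mean squared error over the given known ratings."""
    errors = [r - P[u] @ Q[i] for u, i, r in ratings]
    return float(np.sqrt(np.mean(np.square(errors))))


# Toy data: (user, item, rating) triples for a 3 x 3 matrix with 6 known cells.
ratings = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0),
           (1, 2, 1.0), (2, 1, 2.0), (2, 2, 5.0)]

P, Q = train_sgd(ratings, n_users=3, n_items=3)
training_error = rmse(ratings, P, Q)
```

Alternating least squares minimizes the same objective but, instead of taking small steps, holds one set of factors fixed and solves a least-squares problem for the other, then swaps.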

Why it works well

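It generalizes to unseen user-item pairs: any user vector can be combined with any item vector, even if that user never rated that item.
It scales far better than storing a full similarity matrix, because storage grows linearly with the number of users and items rather than quadratically.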
It often beats neighbor methods on sparse, large catalogs.
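
A quick back-of-the-envelope count illustrates the scaling point. The catalog sizes and k = 50 below are invented for the example, and the comparison assumes a dense item-to-item similarity matrix; neighbor methods that keep only each item's top few neighbors need less.

```python
def factor_model_size(n_users, n_items, k):
    """Numbers stored by a factor model: one length-k vector per user and per item."""
    return (n_users + n_items) * k


def item_similarity_size(n_items):
    """Numbers stored by a dense item-to-item similarity matrix."""
    return n_items * n_items


# Hypothetical catalog: one million users, one hundred thousand items.
n_users, n_items, k = 1_000_000, 100_000, 50

factors = factor_model_size(n_users, n_items, k)
similarities = item_similarity_size(n_items)

print(f"factor model:      {factors:,} numbers")
print(f"similarity matrix: {similarities:,} numbers")
```

Under these assumptions the factor model stores 55 million numbers while the dense similarity matrix stores 10 billion.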

Key idea

Matrix factorization represents users and items as latent factor vectors whose dot product predicts ratings, which lets it generalize across a sparse matrix and often outperform neighbor methods.

Check yourself