Factorizing the rating matrix
Matrix factorization approximates a large, mostly empty user-item matrix as the product of two thin matrices: one holding a vector per user and one holding a vector per item. The predicted score is the dot product of a user vector and an item vector, so each latent dimension acts like a hidden taste factor.
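As a minimal sketch of the dot-product prediction, assuming NumPy and randomly initialized factors (the sizes and the names `U`, `V`, and `predict` are illustrative, not from the text):

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items, k = 4, 5, 2        # hypothetical sizes; k = latent dimensions

U = rng.normal(size=(n_users, k))    # one k-dimensional vector per user
V = rng.normal(size=(n_items, k))    # one k-dimensional vector per item

def predict(U, V, u, i):
    """Predicted score for user u and item i: dot product of their vectors."""
    return float(U[u] @ V[i])

# All predictions at once: the (n_users x n_items) matrix U V^T.
R_hat = U @ V.T
```

Each entry of `R_hat` equals the corresponding `predict` call, so the full matrix never has to be stored if only a few scores are needed.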
Why alternating least squares
Fitting both matrices jointly is a non-convex problem. But if you fix the item vectors, the objective is quadratic in the user vectors, so solving for them becomes a plain least-squares problem, and the same holds with the roles swapped.
- Fix items, solve for all users in closed form.
- Fix users, solve for all items in closed form.
- Repeat until the error stops shrinking.
This is alternating least squares, or ALS.
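The loop above can be sketched roughly as follows. This is an illustrative NumPy version, not a reference implementation: it assumes a dense rating matrix `R` with a 0/1 mask `M` marking observed entries, an L2 penalty `lam`, and hypothetical helper names (`solve_side`, `loss`, `als`).

```python
import numpy as np

def solve_side(R, M, F, lam):
    """With the factors F fixed, solve a ridge least-squares problem per row of R.
    R: (n, m) ratings, M: (n, m) 0/1 mask of observed entries, F: (m, k)."""
    n, k = R.shape[0], F.shape[1]
    out = np.zeros((n, k))
    for r in range(n):
        obs = M[r] > 0                      # entries this row actually rated
        Fo = F[obs]
        A = Fo.T @ Fo + lam * np.eye(k)     # normal equations with L2 term
        b = Fo.T @ R[r, obs]
        out[r] = np.linalg.solve(A, b)      # closed-form solution
    return out

def loss(R, M, U, V, lam):
    """Squared error on observed entries plus the L2 penalty."""
    err = M * (R - U @ V.T)
    return float((err ** 2).sum() + lam * ((U ** 2).sum() + (V ** 2).sum()))

def als(R, M, k=2, lam=0.1, max_iters=50, tol=1e-6, seed=0):
    rng = np.random.default_rng(seed)
    U = rng.normal(scale=0.1, size=(R.shape[0], k))
    V = rng.normal(scale=0.1, size=(R.shape[1], k))
    history = [loss(R, M, U, V, lam)]
    for _ in range(max_iters):
        U = solve_side(R, M, V, lam)        # fix items, solve for all users
        V = solve_side(R.T, M.T, U, lam)    # fix users, solve for all items
        history.append(loss(R, M, U, V, lam))
        if history[-2] - history[-1] < tol: # error stopped shrinking
            break
    return U, V, history

# Toy data (made up for illustration); 0 is assumed to mean "not rated".
R = np.array([[5, 3, 0, 1],
              [4, 0, 0, 1],
              [1, 1, 0, 5],
              [0, 1, 5, 4]], dtype=float)
M = (R > 0).astype(float)
U, V, history = als(R, M)
```

Because each half-step exactly minimizes the regularized objective over one set of vectors, the values in `history` should never increase, which is what makes "repeat until the error stops shrinking" a sensible stopping rule.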
Handling implicit feedback
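With implicit signals such as clicks rather than explicit ratings, ALS attaches a confidence weight to each user-item pair that grows with the number of interactions. Unobserved pairs are treated as low-confidence negatives rather than as missing entries, so every pair contributes to the loss.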
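One way to sketch the confidence-weighted user update, under assumptions the text does not specify: a binary preference (1 if the pair has any interaction, else 0), a linear confidence `1 + alpha * count`, and illustrative values for `alpha` and `lam`. The function name `implicit_user_step` is hypothetical.

```python
import numpy as np

def implicit_user_step(counts, V, alpha=40.0, lam=0.1):
    """Solve for all user vectors with item vectors V held fixed.
    counts: (n_users, n_items) interaction counts, V: (n_items, k)."""
    n_users, k = counts.shape[0], V.shape[1]
    P = (counts > 0).astype(float)    # assumed preference: 1 if any interaction
    C = 1.0 + alpha * counts          # assumed confidence: grows with the count
    U = np.zeros((n_users, k))
    for u in range(n_users):
        Cu = C[u]                                       # per-item confidence
        A = (V * Cu[:, None]).T @ V + lam * np.eye(k)   # V^T C_u V + lam I
        b = V.T @ (Cu * P[u])                           # V^T C_u p_u
        U[u] = np.linalg.solve(A, b)
    return U

# Toy click counts (made up for illustration).
counts = np.array([[3, 0, 1],
                   [0, 2, 0]], dtype=float)
V = np.random.default_rng(0).normal(size=(3, 2))
U = implicit_user_step(counts, V)
```

Unobserved pairs get confidence 1 and preference 0 here, so they still pull predictions toward zero, only weakly. The item update is the same computation with the roles of users and items swapped.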
Practical notes
- Within each ALS step, every user's (or item's) solve is independent of the others, so the step is embarrassingly parallel and distributes well across a cluster.
- A regularization term keeps vectors small and fights overfitting.
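To illustrate the regularization note, here is a small sketch of a single user's solve at several penalty strengths, assuming an L2 (ridge) penalty and made-up data. The name `ridge_solve` and the values of `lam` are illustrative.

```python
import numpy as np

def ridge_solve(F, r, lam):
    """Closed-form least-squares solve for one vector, with L2 penalty lam."""
    k = F.shape[1]
    return np.linalg.solve(F.T @ F + lam * np.eye(k), F.T @ r)

rng = np.random.default_rng(0)
F = rng.normal(size=(6, 3))   # hypothetical fixed item vectors
r = rng.normal(size=6)        # hypothetical ratings from one user

# Norm of the solved user vector as the penalty grows.
norms = [float(np.linalg.norm(ridge_solve(F, r, lam)))
         for lam in (0.0, 0.1, 1.0, 10.0)]
```

The norms should fall as `lam` rises. Since `ridge_solve` depends only on one user's own ratings and the fixed item vectors, the calls for different users could also run in any order or on different machines.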
Key idea
ALS factorizes the rating matrix into user and item vectors by alternating two convex least squares solves, scaling cleanly and extending to implicit feedback with confidence weights.