← Lessons

quiz vs the machine

Gold1470

Machine Learning

The Singular Value Decomposition

Factoring any matrix into rotation, scaling, and rotation.

5 min read · core · beat Gold to climb

The Singular Value Decomposition

The singular value decomposition, or SVD, factors any matrix into three simpler matrices. It is the engine behind PCA, recommender systems, and low rank compression.

The factorization

For a matrix A, the SVD writes it as U times Sigma times V transpose.

  • U holds the left singular vectors, an orthonormal basis for the output space.
  • Sigma is diagonal and holds the singular values, which are non negative and sorted from large to small.
  • V transpose holds the right singular vectors, a basis for the input space.

Geometrically the decomposition says any linear map is a rotation, a scaling, and another rotation.

Low rank approximation

The singular values measure how much each direction contributes. Keeping only the largest k singular values and their vectors gives the best rank k approximation of the matrix. This is how SVD compresses images and powers latent factor models.

Link to PCA

PCA on centered data is exactly the SVD of that data matrix. The right singular vectors are the principal components, and the squared singular values are proportional to the explained variances.

Key idea

SVD factors any matrix into rotation scaling rotation, and truncating its singular values gives the best low rank approximation underpinning PCA.

Check yourself

Answer to earn rating on the learn ladder.

1. What does the Sigma matrix in an SVD contain?

2. How do you form the best rank k approximation of a matrix?

3. How does SVD relate to PCA?