Principal Component Analysis

The purpose

Principal component analysis (PCA) reduces the number of features while retaining as much of the data's variance as possible. It finds new, mutually orthogonal axes, called principal components, that point along the directions of greatest variance in the data.

How it works

PCA looks at how features vary together, in three steps:

Center the data by subtracting each feature's mean
Find the directions along which the centered data spreads the most (the eigenvectors of its covariance matrix)
Order these directions by the variance they capture, so the first component captures the most, the second the next most, and so on
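
The steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference implementation: the function name principal_components and the synthetic two-feature data are assumptions made for the example.

```python
import numpy as np

def principal_components(X):
    """Return (components, variances), sorted by decreasing variance.

    Hypothetical helper for illustration; rows of `components` are the
    principal directions.
    """
    # Step 1: center each feature on its mean.
    X_centered = X - X.mean(axis=0)
    # Step 2: the directions of greatest spread are the eigenvectors
    # of the covariance matrix (eigh is meant for symmetric matrices).
    cov = np.cov(X_centered, rowvar=False)
    eigenvalues, eigenvectors = np.linalg.eigh(cov)
    # Step 3: eigh returns ascending eigenvalues, so reverse the order.
    order = np.argsort(eigenvalues)[::-1]
    return eigenvectors[:, order].T, eigenvalues[order]

# Synthetic data (an assumption): the second feature is roughly twice the first.
rng = np.random.default_rng(0)
t = rng.normal(size=200)
X = np.column_stack([t, 2 * t + 0.1 * rng.normal(size=200)])

components, variances = principal_components(X)
```

Here the first row of components should point close to the direction (1, 2), up to sign, because that is the line along which the synthetic data varies most.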

Projecting the data onto only the top few components represents it in fewer dimensions while preserving as much of its variance as that number of dimensions allows.
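
As a rough sketch of that projection, the snippet below keeps the top k components and reports the share of total variance each one explains. It uses the singular value decomposition of the centered data, which yields the same directions as the covariance eigenvectors. The function name project_top_k and the synthetic five-feature data are assumptions for illustration.

```python
import numpy as np

def project_top_k(X, k):
    """Project X onto its top-k principal components (hypothetical helper)."""
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are principal directions, already sorted by singular value.
    _, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    # Squared singular values are proportional to the variance per component.
    explained_ratio = (S ** 2) / np.sum(S ** 2)
    Z = X_centered @ Vt[:k].T
    return Z, explained_ratio[:k]

# Synthetic data (an assumption): five features driven by two hidden factors.
rng = np.random.default_rng(1)
latent = rng.normal(size=(300, 2))
mixing = rng.normal(size=(2, 5))
X = latent @ mixing + 0.05 * rng.normal(size=(300, 5))

Z, ratio = project_top_k(X, k=2)
```

Because this data was built from two underlying factors plus a little noise, two components should account for nearly all of the variance. On real data the explained ratio is a common guide for choosing k.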

Why use it

Speeds up downstream models by cutting dimensionality
Helps visualization by projecting to two or three dimensions
Can reduce noise by dropping low-variance directions
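
The noise-reduction point can be illustrated by projecting onto the top component and mapping back to the original space. This is a sketch under a strong assumption: the synthetic signal lies along a single direction and the noise is small and spread evenly across features. The helper name reconstruct_top_k is hypothetical.

```python
import numpy as np

def reconstruct_top_k(X, k):
    """Rebuild X from its top-k principal components (hypothetical helper)."""
    mean = X.mean(axis=0)
    X_centered = X - mean
    _, _, Vt = np.linalg.svd(X_centered, full_matrices=False)
    top = Vt[:k]
    # Project down to k dimensions, then map back to the original space.
    return (X_centered @ top.T) @ top + mean

# Synthetic data (an assumption): a rank-1 signal in 10 features plus noise.
rng = np.random.default_rng(3)
direction = rng.normal(size=10)
clean = rng.normal(size=(400, 1)) * direction
noisy = clean + 0.2 * rng.normal(size=(400, 10))

denoised = reconstruct_top_k(noisy, k=1)
noisy_error = np.mean((noisy - clean) ** 2)
denoised_error = np.mean((denoised - clean) ** 2)
```

In this setup denoised_error should come out smaller than noisy_error, because most of the noise falls in the nine discarded directions. If the signal itself has low variance, dropping those directions would discard it too.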

Cautions

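PCA components are linear combinations of the original features, so they can be hard to interpret. Features measured on different scales should usually be standardized first (zero mean, unit variance), since PCA is sensitive to magnitude: a feature with a much larger numeric range than the others will dominate the leading components.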
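
A small experiment can make the scaling caution concrete. The two synthetic features below are assumptions for illustration: they carry the same underlying signal, but one is expressed in units 1000 times larger. The helper name first_component is hypothetical.

```python
import numpy as np

def first_component(X):
    """Return the first principal direction of X (hypothetical helper)."""
    X_centered = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X_centered, full_matrices=False)
    return Vt[0]

# Synthetic data (an assumption): two correlated features on very different scales.
rng = np.random.default_rng(2)
shared = rng.normal(size=500)
small = shared + 0.3 * rng.normal(size=500)
large = 1000 * (shared + 0.3 * rng.normal(size=500))
X = np.column_stack([small, large])

# Without scaling, the large-unit feature dominates the first component.
raw_pc1 = first_component(X)

# After standardizing, both features contribute comparably.
X_std = (X - X.mean(axis=0)) / X.std(axis=0)
std_pc1 = first_component(X_std)
```

On the raw data the first component should align almost entirely with the large-unit feature, while on the standardized data the two weights should have roughly equal magnitude.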

Key idea