The phenomenon
The curse of dimensionality describes how high-dimensional spaces behave in unintuitive ways. As you add features, the volume of the space grows exponentially, so a fixed number of data points becomes increasingly sparse.
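One way to see the sparsity is to ask how wide a sub-cube must be to enclose a fixed share of uniformly spread data. The sketch below assumes data spread uniformly over a unit hypercube; the function name `edge_length_for_fraction` is made up for illustration.

```python
def edge_length_for_fraction(fraction: float, dims: int) -> float:
    """Side length of a sub-cube holding `fraction` of a unit hypercube's volume.

    Assumes uniformly distributed data, so volume share equals data share.
    """
    return fraction ** (1.0 / dims)


# How wide must a "local" neighborhood be to capture 10% of the data?
for dims in (1, 2, 10, 100):
    print(dims, round(edge_length_for_fraction(0.1, dims), 3))
```

Under this assumption, the side length needed to capture 10% of the data is 0.1 in one dimension, about 0.32 in two, about 0.79 in ten, and about 0.98 in a hundred. A neighborhood that spans nearly the whole range of every feature is no longer local.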
Why it hurts
- To keep the same density of points, the number of samples needed grows exponentially with the number of dimensions.
- In high dimensions, the distances from a point to its nearest and farthest neighbors tend to become nearly equal.
- When all points seem equally far away, distance-based methods like k-nearest neighbors (KNN) lose their grip.
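The first two points can be checked with a small simulation. This is a minimal sketch, assuming points drawn uniformly from the unit hypercube and Euclidean distance; `points_for_density` and `relative_contrast` are hypothetical helper names, and the sample sizes are arbitrary.

```python
import math
import random


def points_for_density(points_per_axis: int, dims: int) -> int:
    """Samples needed to keep `points_per_axis` points along every axis."""
    return points_per_axis ** dims


def relative_contrast(n_points: int, dims: int, seed: int = 0) -> float:
    """(farthest - nearest) / nearest distance from one random query point.

    Assumes uniform data in the unit hypercube and Euclidean distance.
    """
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dims)]
    distances = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(dims)]
        distances.append(math.dist(query, point))
    return (max(distances) - min(distances)) / min(distances)


print(points_for_density(10, 1), points_for_density(10, 3), points_for_density(10, 10))

for dims in (2, 10, 100, 1000):
    print(dims, round(relative_contrast(200, dims), 2))
```

Ten points per axis means 10 samples in one dimension but 10 billion in ten. Under these assumptions the relative contrast should shrink sharply as `dims` grows, which is the "equally far away" effect: the nearest neighbor is barely nearer than the farthest one.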
Where it bites
- Distance-based methods degrade because neighborhoods stop being meaningful.
- Models overfit more easily: with few points relative to the number of dimensions, most of the space is empty and a flexible model can fit noise there.
- Visual intuition from two or three dimensions stops applying.
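The first point can be illustrated with a toy experiment. The sketch below is an assumption-laden example, not a benchmark: the data is synthetic, with one informative feature and a variable number of pure-noise features, and the classifier is a hand-written 1-nearest-neighbor rule. The helper names `make_data` and `one_nn_accuracy` are hypothetical.

```python
import math
import random


def make_data(n, noise_dims, rng):
    """Synthetic rows: one informative feature followed by pure-noise features."""
    rows, labels = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        informative = label + rng.gauss(0.0, 0.3)  # carries the class signal
        noise = [rng.gauss(0.0, 1.0) for _ in range(noise_dims)]  # carries none
        rows.append([informative] + noise)
        labels.append(label)
    return rows, labels


def one_nn_accuracy(noise_dims, n_train=150, n_test=100, seed=0):
    """Test accuracy of a 1-nearest-neighbor rule using Euclidean distance."""
    rng = random.Random(seed)
    train_x, train_y = make_data(n_train, noise_dims, rng)
    test_x, test_y = make_data(n_test, noise_dims, rng)
    correct = 0
    for x, y in zip(test_x, test_y):
        nearest = min(range(n_train), key=lambda i: math.dist(x, train_x[i]))
        correct += train_y[nearest] == y
    return correct / n_test


for noise_dims in (0, 10, 100):
    print(noise_dims, one_nn_accuracy(noise_dims))
```

In this setup, accuracy should be high with no noise features and drift toward chance as noise features are added, because the irrelevant coordinates dominate the distance and the nearest neighbor is no longer the most similar point on the feature that matters.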
How to fight it
- Use feature selection to drop irrelevant inputs.
- Apply dimensionality reduction to compress the data onto a few useful axes.
- Prefer models that build in strong structure or regularization.
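As one possible illustration of the first remedy, the sketch below ranks features by the absolute Pearson correlation with the label and keeps the top few. This univariate filter is only one of many feature selection strategies and is chosen here for brevity. The data is synthetic, and `pearson` and `top_k_features` are hypothetical helpers written for this example.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def top_k_features(rows, labels, k):
    """Indices of the k features most correlated (in absolute value) with the label."""
    scores = []
    for j in range(len(rows[0])):
        column = [row[j] for row in rows]
        scores.append((abs(pearson(column, labels)), j))
    scores.sort(reverse=True)
    return [j for _, j in scores[:k]]


# Synthetic data (an assumption): feature 0 is informative, the other 50 are noise.
rng = random.Random(0)
labels = [rng.randint(0, 1) for _ in range(300)]
rows = [
    [y + rng.gauss(0.0, 0.3)] + [rng.gauss(0.0, 1.0) for _ in range(50)]
    for y in labels
]

print(top_k_features(rows, labels, k=1))
```

Keeping only the selected column shrinks the problem from 51 dimensions to 1 before any distance is computed. Dimensionality reduction follows the same spirit but builds new axes as combinations of the inputs instead of picking existing ones.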
Key idea
In high dimensions, space grows so fast that data becomes sparse and distances flatten, which breaks distance-based methods unless you reduce the number of dimensions.