The K Nearest Neighbors

A lazy learner

K nearest neighbors stores the training data and does no real fitting. To predict, it finds the k closest stored points and lets them vote for classification or average for regression.

Choosing k

Small k follows the data closely but is sensitive to noise.
Large k smooths predictions but can blur real boundaries.
Odd k avoids ties in binary voting.

What it depends on

A distance metric to decide who is near.
Feature scaling, since a large scale feature can dominate the distance.
A reasonable number of features, because distances blur in very high dimensions.

Costs and tradeoffs

Training is instant; prediction is slow since it scans the data.
Memory grows with the dataset.
Spatial indexes like trees speed up neighbor search in low dimensions.

Key idea

K nearest neighbors predicts from the closest stored examples. It needs a distance metric and scaled features, and shifts all the cost from training to prediction time.

The K Nearest Neighbors

A lazy learner

Choosing k

What it depends on

Costs and tradeoffs

Key idea

Check yourself