Uniform votes are wasteful
Basic k-nearest neighbors (KNN) gives each of the k neighbors an equal vote. But a neighbor right next to the query point is usually more informative than one at the edge of the neighborhood. Weighting schemes address this by letting closer neighbors count more.
Distance weighting
- With uniform weights every neighbor counts the same.
- With distance weights each neighbor's weight is the inverse of its distance to the query (1/d), so closer points dominate.
- Custom kernels, such as a Gaussian on distance, give a smooth falloff of influence.
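As a minimal sketch of the three schemes above, the following Python computes normalized weights for the same set of neighbor distances. The function names and the bandwidth parameter are illustrative assumptions, not part of any particular library, and the inverse-distance version assumes every distance is strictly positive.

```python
import math

def uniform_weights(distances):
    """Every neighbor gets the same weight."""
    return [1.0 for _ in distances]

def inverse_distance_weights(distances):
    """Weight 1/d per neighbor; assumes all distances are > 0."""
    return [1.0 / d for d in distances]

def gaussian_weights(distances, bandwidth=1.0):
    """Weight exp(-d^2 / (2 * bandwidth^2)); bandwidth is a hypothetical tuning knob."""
    return [math.exp(-(d * d) / (2.0 * bandwidth ** 2)) for d in distances]

def normalize(weights):
    """Rescale weights so they sum to 1, which makes schemes comparable."""
    total = sum(weights)
    return [w / total for w in weights]

# Distances from a query to its three nearest neighbors (made-up values).
distances = [0.5, 1.0, 2.0]

for name, scheme in [
    ("uniform", uniform_weights),
    ("inverse", inverse_distance_weights),
    ("gaussian", gaussian_weights),
]:
    shares = normalize(scheme(distances))
    print(name, [round(s, 3) for s in shares])
```

With these distances the uniform scheme gives each neighbor one third of the vote, while the inverse scheme gives the closest neighbor four sevenths. A smaller bandwidth makes the Gaussian falloff steeper.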
Effects on the model
- Distance weighting reduces sensitivity to the exact choice of k, since far neighbors barely matter.
- It tends to smooth decision boundaries and can improve accuracy when the density of training points varies across the feature space.
- For regression the prediction becomes a weighted average of neighbor targets.
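The effects listed above can be seen on a small made-up neighborhood. The sketch below, with hypothetical helper names and invented data, compares a uniform and an inverse-distance prediction for both classification (weighted vote) and regression (weighted average).

```python
from collections import defaultdict

def weighted_vote(labels, weights):
    """Return the label with the largest total weight."""
    totals = defaultdict(float)
    for label, w in zip(labels, weights):
        totals[label] += w
    return max(totals, key=totals.get)

def weighted_average(targets, weights):
    """Weighted mean of neighbor targets: sum(w * y) / sum(w)."""
    return sum(w * t for w, t in zip(weights, targets)) / sum(weights)

# Five neighbors: two very close, three near the edge of the neighborhood.
distances = [0.1, 0.2, 1.5, 1.6, 1.7]
labels = ["A", "A", "B", "B", "B"]
targets = [10.0, 12.0, 30.0, 31.0, 32.0]

uniform = [1.0] * len(distances)
inverse = [1.0 / d for d in distances]  # assumes no zero distances

print(weighted_vote(labels, uniform))      # B: three far neighbors outvote two near ones
print(weighted_vote(labels, inverse))      # A: the two near neighbors dominate
print(weighted_average(targets, uniform))  # 23.0
print(weighted_average(targets, inverse))  # roughly 12.9, pulled toward the near targets
```

In this example the three far neighbors decide the uniform result but contribute little under inverse weights, which illustrates why adding or dropping far neighbors (changing k) matters less once distance weighting is on.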
Cautions
- Inverse distance weights blow up (division by zero) when a query coincides with a training point, so implementations cap the weight or special-case zero distance.
- Always scale features first, or a feature with a large numeric range will dominate the distance.
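Both cautions can be sketched in a few lines of Python. The zero-distance rule used here (exact matches get all the weight, everything else gets none) is one possible convention, not the only one, and the helper names and the income/age data are made up for illustration.

```python
import math

def safe_inverse_weights(distances):
    """Inverse-distance weights with a special case for exact matches.

    If any neighbor sits at distance 0, only those neighbors get weight;
    this avoids dividing by zero. Other conventions (e.g. capping) exist.
    """
    if any(d == 0.0 for d in distances):
        return [1.0 if d == 0.0 else 0.0 for d in distances]
    return [1.0 / d for d in distances]

def standardize(rows):
    """Rescale each column to zero mean and unit standard deviation."""
    columns = list(zip(*rows))
    means = [sum(c) / len(c) for c in columns]
    # "or 1.0" guards against constant columns, whose std would be 0.
    stds = [
        math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
        for c, m in zip(columns, means)
    ]
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Columns: income in dollars, age in years (invented values).
rows = [[30000, 25], [30500, 60], [33000, 26], [60000, 40]]
scaled = standardize(rows)

# Unscaled, income dominates: row 1 (35 years older) looks closest to row 0.
print(euclidean(rows[0], rows[1]) < euclidean(rows[0], rows[2]))      # True
# Scaled, row 2 (similar age and income) is closest to row 0.
print(euclidean(scaled[0], scaled[2]) < euclidean(scaled[0], scaled[1]))  # True

print(safe_inverse_weights([0.0, 0.5, 1.0]))  # [1.0, 0.0, 0.0]
```

In practice the scaling statistics would be computed on the training set only and then reused for queries, so that training points and queries live in the same rescaled space.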
Key idea
KNN weighting makes nearer neighbors count more, usually by inverse distance. This reduces sensitivity to k and smooths boundaries, but requires feature scaling and care when distance is zero.