← Lessons

quiz vs the machine

Gold1350

Machine Learning

Feature Scaling Normalization and Standardization

Put features on comparable ranges so distance and gradient based models behave well.

5 min read · core · beat Gold to climb

Feature Scaling Normalization and Standardization

Features often span wildly different ranges, such as age in years and income in thousands. Scaling brings them onto comparable scales so no single feature dominates by magnitude alone.

Two common scalers

  • Normalization, or min max scaling, rescales values into a fixed range like zero to one.
  • Standardization, or z score scaling, shifts to zero mean and scales to unit variance.

Standardization handles outliers and unknown bounds more gracefully, while min max preserves a bounded range useful for some neural network inputs.

Which models care

  • Distance based models such as KNN and SVM are sensitive to scale, because a large range feature dominates the distance.
  • Gradient based models train faster on scaled inputs because the loss surface is better conditioned.
  • Tree based models split on thresholds and are essentially scale invariant, so they rarely need it.

Fit the scaler on the training set and apply the same parameters to validation and test data to prevent leakage.

Key idea

Scaling equalizes feature ranges so distance and gradient based models behave well, while tree models stay scale invariant.

Check yourself

Answer to earn rating on the learn ladder.

1. What does standardization produce?

2. Which model type is essentially scale invariant?

3. Where should scaler parameters be fit?