Feature Scaling Normalization and Standardization
Features often span wildly different ranges, such as age in years and income in thousands. Scaling brings them onto comparable scales so no single feature dominates by magnitude alone.
Two common scalers
- Normalization, or min max scaling, rescales values into a fixed range like zero to one.
- Standardization, or z score scaling, shifts to zero mean and scales to unit variance.
Standardization handles outliers and unknown bounds more gracefully, while min max preserves a bounded range useful for some neural network inputs.
Which models care
- Distance based models such as KNN and SVM are sensitive to scale, because a large range feature dominates the distance.
- Gradient based models train faster on scaled inputs because the loss surface is better conditioned.
- Tree based models split on thresholds and are essentially scale invariant, so they rarely need it.
Fit the scaler on the training set and apply the same parameters to validation and test data to prevent leakage.
Key idea
Scaling equalizes feature ranges so distance and gradient based models behave well, while tree models stay scale invariant.