Overfitting And Underfitting

Recognize when a model memorizes noise versus when it fails to learn the signal.

Every model sits somewhere between two failure modes. Underfitting means the model has not learned the pattern, while overfitting means it has learned the noise along with it. The gap between training and validation scores, read together with how good those scores are, is your main clue.

Underfitting

The model performs poorly on both training and validation data.
Causes include too few features, too little capacity, or excessive regularization.
Fixes: add features, use a richer model, or reduce regularization.
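As a rough illustration of the "too little capacity" case and the "use a richer model" fix, the sketch below compares a constant predictor with a least-squares line. The data is synthetic and hypothetical (y = 3x plus small Gaussian noise), and the names make_data, mse, constant_model, and line_model are invented for this example.

```python
import random

random.seed(0)

def make_data(n):
    # Hypothetical signal: y = 3x plus small Gaussian noise.
    xs = [random.random() for _ in range(n)]
    ys = [3.0 * x + random.gauss(0.0, 0.1) for x in xs]
    return xs, ys

def mse(predict, xs, ys):
    # Mean squared error of a prediction function on a dataset.
    return sum((predict(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

train_x, train_y = make_data(200)
val_x, val_y = make_data(200)

# Too little capacity: always predict the training mean, ignoring x.
mean_x = sum(train_x) / len(train_x)
mean_y = sum(train_y) / len(train_y)

def constant_model(x):
    return mean_y

# Richer model: a least-squares line fitted on the training set.
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(train_x, train_y))
         / sum((x - mean_x) ** 2 for x in train_x))
intercept = mean_y - slope * mean_x

def line_model(x):
    return slope * x + intercept

const_train = mse(constant_model, train_x, train_y)
const_val = mse(constant_model, val_x, val_y)
line_train = mse(line_model, train_x, train_y)
line_val = mse(line_model, val_x, val_y)

print(f"constant  train MSE={const_train:.3f}  val MSE={const_val:.3f}")
print(f"line      train MSE={line_train:.3f}  val MSE={line_val:.3f}")
```

The constant predictor should score poorly on both sets, which is the underfitting signature described above. Giving the model a slope lets it capture the signal, so both errors drop.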

Overfitting

The model performs well on training data but poorly on validation data.
It has memorized quirks and noise that do not generalize.
Fixes: gather more data, simplify the model, add regularization, or use early stopping.
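Memorization can be sketched with a nearest-neighbor regressor. With k = 1 the model returns the label of the closest training point, so it reproduces the training set exactly, noise included. The data here is synthetic and hypothetical (a sine curve plus Gaussian noise), and make_data, knn_predict, and knn_mse are names invented for this example. Raising k averages over more neighbors, which is one way to simplify the model.

```python
import math
import random

random.seed(1)

def make_data(n):
    # Hypothetical signal: y = sin(2*pi*x) plus Gaussian noise.
    xs = [random.random() for _ in range(n)]
    ys = [math.sin(2 * math.pi * x) + random.gauss(0.0, 0.3) for x in xs]
    return xs, ys

def knn_predict(x, train_x, train_y, k):
    # Average the labels of the k training points closest to x.
    nearest = sorted(zip(train_x, train_y), key=lambda p: abs(p[0] - x))[:k]
    return sum(y for _, y in nearest) / k

def knn_mse(k, xs, ys, train_x, train_y):
    # Mean squared error of the k-NN predictor on (xs, ys).
    return sum((knn_predict(x, train_x, train_y, k) - y) ** 2
               for x, y in zip(xs, ys)) / len(xs)

train_x, train_y = make_data(200)
val_x, val_y = make_data(200)

results = {}
for k in (1, 10):
    train_err = knn_mse(k, train_x, train_y, train_x, train_y)
    val_err = knn_mse(k, val_x, val_y, train_x, train_y)
    results[k] = (train_err, val_err)
    print(f"k={k:2d}  train MSE={train_err:.3f}  val MSE={val_err:.3f}  "
          f"gap={val_err - train_err:.3f}")
```

With k = 1 the training error is zero because every training point is its own nearest neighbor, while the validation error is not, so the whole validation error shows up as gap. A larger k typically narrows that gap, at the cost of a nonzero training error.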

Reading the gap

A small gap with high error on both sets signals underfitting. A large gap, with validation error well above training error, signals overfitting. The goal is a model that generalizes, where validation error is both low and close to training error.
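The rule of thumb above can be written as a small helper. This is only a sketch: the function name diagnose and the thresholds high_error and large_gap are hypothetical, and sensible values depend on the metric and the problem.

```python
def diagnose(train_error, val_error, high_error=0.2, large_gap=0.1):
    """Classify a fit from its training and validation error.

    The thresholds are illustrative assumptions, not standard values.
    """
    gap = val_error - train_error
    if gap > large_gap:
        # Validation error is much worse than training error.
        return "overfitting"
    if train_error > high_error:
        # Small gap, but the model is poor even on data it has seen.
        return "underfitting"
    # Low error and a small gap.
    return "generalizing"

print(diagnose(0.35, 0.37))  # high error, small gap
print(diagnose(0.02, 0.30))  # large gap
print(diagnose(0.05, 0.07))  # low error, small gap
```

The gap is checked first, so a model with both high training error and a large gap is reported as overfitting. That ordering is a design choice for this sketch. In practice such a model may need both kinds of fix.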

Key idea

Underfitting shows poor scores on both training and validation data, while overfitting shows a large train-to-validation gap. The cure depends on which failure mode you diagnose.
