Overfitting and Underfitting Revisited
A model can fail in two opposite ways, and naming them precisely guides every fix.
Underfitting happens when a model is too simple to capture the real pattern. It performs poorly on both the training data and new data. The model has high bias, meaning its assumptions are too rigid.
Overfitting happens when a model is so flexible that it memorizes the training data, including its noise. It scores beautifully on training data but poorly on unseen data. The model has high variance, meaning it reacts too much to the specific examples it saw.
You diagnose these by comparing two numbers:
- Training error measures fit on data the model learned from
- Validation error measures fit on held-out data
The patterns are telling:
- Both errors high points to underfitting
- Training error low but validation error high points to overfitting
- Both errors low and close together is the sweet spot
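The three patterns above can be written as a small decision rule. This is a minimal sketch: the function name `diagnose` and the thresholds `high` and `gap` are hypothetical and chosen only for illustration, since what counts as a "high" error or a "large" gap depends on the task and the metric.

```python
def diagnose(train_error, val_error, high=0.2, gap=0.05):
    """Label a fit from its training and validation error.

    `high` and `gap` are illustrative thresholds, not standard values;
    set them relative to what counts as acceptable error on your task.
    """
    if train_error >= high:
        # The model cannot fit even the data it learned from.
        return "underfitting"
    if val_error - train_error > gap:
        # Fits the training data but fails on held-out data.
        return "overfitting"
    # Both errors are low and close together.
    return "good fit"


print(diagnose(0.30, 0.32))
print(diagnose(0.02, 0.25))
print(diagnose(0.05, 0.07))
```

The rule checks training error first because a model that fits its own training data poorly is underfitting regardless of how small the gap is.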
The cure differs by case. Underfitting calls for a richer model or better features. Overfitting calls for more data, regularization, or a simpler model. The goal is the middle ground where the model captures signal but ignores noise.
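A small experiment can make the three regimes concrete. The sketch below is not from the text above: it assumes a k-nearest-neighbors regressor on synthetic data (a noisy sine curve), with the neighbor count `k` standing in for model flexibility. With `k=1` the model memorizes every training point, with `k` equal to the training set size it predicts a single average everywhere, and a moderate `k` usually lands in between. The data, noise level, and values of `k` are all illustrative assumptions.

```python
import math
import random


def make_data(n, offset, rng):
    """Noisy samples of sin(2*pi*x) on [0, 1); curve and noise level are illustrative."""
    xs = [(i + offset) / n for i in range(n)]
    ys = [math.sin(2 * math.pi * x) + rng.gauss(0, 0.3) for x in xs]
    return xs, ys


def knn_predict(x, train_x, train_y, k):
    """Average the targets of the k training points closest to x."""
    nearest = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x))[:k]
    return sum(train_y[i] for i in nearest) / k


def mse(xs, ys, train_x, train_y, k):
    """Mean squared error of the k-NN model on the points (xs, ys)."""
    errors = [(knn_predict(x, train_x, train_y, k) - y) ** 2 for x, y in zip(xs, ys)]
    return sum(errors) / len(errors)


rng = random.Random(0)
train_x, train_y = make_data(200, 0.0, rng)  # data the model learns from
val_x, val_y = make_data(200, 0.5, rng)      # held-out data at different x values

results = {}
for k in (1, 15, 200):  # very flexible, moderate, very rigid
    train_err = mse(train_x, train_y, train_x, train_y, k)
    val_err = mse(val_x, val_y, train_x, train_y, k)
    results[k] = (train_err, val_err)
    print(f"k={k:3d}  train MSE={train_err:.3f}  validation MSE={val_err:.3f}")
```

With `k=1` the training error is exactly zero because each training point is its own nearest neighbor, yet the validation error is not, which is the overfitting pattern. With `k=200` both errors are large, which is the underfitting pattern. Lowering flexibility from `k=1` toward a moderate `k` corresponds to the "simpler model" cure, and raising it from `k=200` corresponds to the "richer model" cure.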
Key idea
Underfitting means the model is too rigid (high bias); overfitting means it memorizes noise (high variance); the levels of training and validation error, and the gap between them, reveal which you have.