Machine Learning

Overfitting and Underfitting Revisited

The tug of war between memorizing and missing.

A model can fail in two opposite ways, and naming them precisely guides every fix.

Underfitting happens when a model is too simple to capture the real pattern. It performs poorly on both the training data and new data. The model has high bias, meaning its assumptions are too rigid.

Overfitting happens when a model is so flexible that it memorizes the training data, including its noise. It scores beautifully on training data but poorly on unseen data. The model has high variance, meaning it reacts too much to the specific examples it saw.
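Both failure modes are easy to reproduce on a toy problem. The sketch below is only an illustration under stated assumptions: the sine-wave ground truth, the noise level, the sample sizes, the seed, and the polynomial degrees (1, 3, 12) are all invented for this example and do not come from the lesson. It fits polynomials of increasing flexibility to 15 noisy points and reports the error on the training points and on fresh points.

```python
import numpy as np
from numpy.polynomial import Polynomial

rng = np.random.default_rng(0)  # fixed seed, chosen arbitrarily


def make_data(n):
    # Assumed ground truth: one period of a sine wave plus Gaussian noise.
    x = rng.uniform(0.0, 1.0, n)
    y = np.sin(2 * np.pi * x) + rng.normal(0.0, 0.2, n)
    return x, y


def mse(model, x, y):
    # Mean squared error of a fitted polynomial on the given points.
    return float(np.mean((model(x) - y) ** 2))


x_train, y_train = make_data(15)   # small training set
x_val, y_val = make_data(200)      # fresh points the model never sees

results = {}
for degree in (1, 3, 12):
    # Least-squares polynomial fit of the given degree.
    model = Polynomial.fit(x_train, y_train, degree)
    results[degree] = (mse(model, x_train, y_train), mse(model, x_val, y_val))
    train_err, val_err = results[degree]
    print(f"degree {degree:2d}: train MSE {train_err:.4f}, validation MSE {val_err:.4f}")
```

In a setup like this, the straight line (degree 1) is too rigid to follow the curve, so it does badly on both sets. The degree 12 polynomial has almost as many coefficients as there are training points, so it can chase the noise: its training error is the smallest of the three, while its error on fresh points is typically far larger.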

You diagnose these by comparing two numbers:

  • Training error measures fit on the data the model learned from
  • Validation error measures fit on held-out data the model never saw during training

The patterns are telling:

  • High error on both sets points to underfitting
  • Low training error with high validation error points to overfitting
  • Low error on both sets, with the two values close together, is the sweet spot
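The three patterns above can be written as a small decision rule. This is a rough sketch, not a standard recipe: the function name `diagnose` and the cutoffs `high` and `close` are hypothetical, and what counts as a "high" error or a "close" gap depends entirely on the task and the metric.

```python
def diagnose(train_error, val_error, high=0.2, close=0.05):
    """Label a (train, validation) error pair using the lesson's three patterns.

    `high` and `close` are made-up cutoffs for illustration only.
    """
    train_high = train_error > high
    val_high = val_error > high

    if train_high and val_high:
        return "underfitting"      # both errors high
    if not train_high and val_high:
        return "overfitting"       # training low, validation high
    if not train_high and not val_high and abs(val_error - train_error) <= close:
        return "good fit"          # both low and close together
    return "inconclusive"          # anything the three patterns do not cover


print(diagnose(0.40, 0.45))
print(diagnose(0.02, 0.35))
print(diagnose(0.05, 0.07))
```

The "inconclusive" branch is there because real numbers do not always fall neatly into one of the three cases, for example when both errors are under the cutoff but still far apart.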

The cure differs by case. Underfitting calls for a richer model or better features. Overfitting calls for more data, regularization, or a simpler model. The goal is the middle ground where the model captures signal but ignores noise.
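To make one of these cures concrete, here is a minimal sketch of regularization using ridge regression (an L2 penalty), which is just one common form of it. The data, the seed, the polynomial degree, and the penalty values `lam` are assumptions made up for the example. The idea is to keep the flexible model but penalize large coefficients, so it has less freedom to bend around individual noisy points.

```python
import numpy as np

rng = np.random.default_rng(1)  # fixed seed, chosen arbitrarily

# Assumed toy data: 15 noisy samples of a sine wave on [-1, 1].
x = rng.uniform(-1.0, 1.0, 15)
y = np.sin(np.pi * x) + rng.normal(0.0, 0.2, 15)

degree = 9
# Feature matrix with columns x^0, x^1, ..., x^degree.
X = np.vander(x, degree + 1, increasing=True)


def ridge_fit(X, y, lam):
    # Closed-form ridge solution: (X^T X + lam * I)^(-1) X^T y.
    # Simplification: the constant term is penalized along with the rest.
    A = X.T @ X + lam * np.eye(X.shape[1])
    return np.linalg.solve(A, X.T @ y)


summary = []
for lam in (1e-6, 1e-2, 1.0):
    w = ridge_fit(X, y, lam)
    train_mse = float(np.mean((X @ w - y) ** 2))
    weight_norm = float(np.linalg.norm(w))
    summary.append((lam, weight_norm, train_mse))
    print(f"lam={lam:g}: weight norm {weight_norm:.3f}, train MSE {train_mse:.4f}")
```

A larger penalty shrinks the coefficients and gives up some training accuracy. Whether that trade improves validation error has to be checked on held-out data, and the penalty strength is usually chosen that way.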

Key idea

Underfitting is too rigid with high bias; overfitting memorizes noise with high variance; comparing training error with validation error reveals which you have.

Check yourself

1. What signals overfitting?

2. Underfitting is associated with which property?

3. A good fix for overfitting is