The Learning Curve Diagnosis

Plot error versus training set size to decide whether more data or more model capacity will help.

A learning curve plots training and validation error as a function of the number of training examples. Its shape tells you whether gathering more data or changing the model is the right next move.
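As a rough illustration of how such a curve could be computed, the sketch below trains a simple model on growing subsets of the training data and records both errors at each size. Everything in it is an assumption made for the example: the synthetic data (y = x² plus noise), the straight-line model, the subset sizes, and the helper names `fit_line`, `mean_squared_error`, `learning_curve`, and `make_data`.

```python
import random

def fit_line(xs, ys):
    """Closed-form least squares fit of y = a*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = cov_xy / var_x if var_x > 0 else 0.0
    b = mean_y - a * mean_x
    return a, b

def mean_squared_error(model, xs, ys):
    a, b = model
    return sum((a * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def learning_curve(train, val, sizes):
    """Return one (size, train_error, val_error) tuple per subset size."""
    val_xs, val_ys = zip(*val)
    points = []
    for m in sizes:
        xs, ys = zip(*train[:m])          # first m training examples
        model = fit_line(xs, ys)          # refit from scratch at each size
        points.append((m,
                       mean_squared_error(model, xs, ys),
                       mean_squared_error(model, val_xs, val_ys)))
    return points

def make_data(n, rng):
    # Hypothetical data: a curved target that a straight line cannot represent
    data = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        data.append((x, x * x + rng.gauss(0.0, 0.1)))
    return data

rng = random.Random(0)
train = make_data(200, rng)
val = make_data(200, rng)
curve = learning_curve(train, val, sizes=[2, 5, 10, 20, 50, 100, 200])
for m, train_err, val_err in curve:
    print(f"{m:4d}  train={train_err:.3f}  val={val_err:.3f}")
```

Because the line cannot follow the curved target, both errors in this setup should settle at a similar, fairly high value as the subset grows. Validation error is always measured on the same held-out set so the points are comparable across sizes.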

How to read it

  • High bias: both curves converge to a high error. The lines nearly meet, but at a poor level, so more data will not help much.
  • High variance: a large gap separates low training error from higher validation error. The gap shrinks as data grows, so more data does help.
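The two patterns above can be turned into a simple rule of thumb. The sketch below is one possible way to do that, not a standard procedure: the function name `diagnose` and the thresholds `target_err` (the error you would accept) and `gap_tol` (the largest gap you would call small) are hypothetical and depend on the problem.

```python
def diagnose(train_err, val_err, target_err, gap_tol):
    """Classify the right-hand end of a learning curve.

    train_err, val_err: errors at the largest training size.
    target_err: assumed acceptable error level (problem specific).
    gap_tol: assumed largest train/validation gap considered small.
    """
    gap = val_err - train_err
    high_bias = train_err > target_err      # even the training data is fit poorly
    high_variance = gap > gap_tol           # validation lags far behind training

    if high_bias and high_variance:
        return "high bias and high variance"
    if high_bias:
        return "high bias"
    if high_variance:
        return "high variance"
    return "healthy"

# Made-up error values, purely for illustration
print(diagnose(0.30, 0.32, target_err=0.10, gap_tol=0.05))
print(diagnose(0.02, 0.25, target_err=0.10, gap_tol=0.05))
```

The first call describes curves that meet at a poor level, the second a low training error with a wide gap. Real curves are noisy, so a check like this is best read alongside the plot rather than instead of it.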

Acting on the curve

  • For high bias, add capacity or features rather than data.
  • For high variance, add data, regularize, or simplify the model.
  • A healthy curve shows both errors converging to a low value with a small gap.

Learning curves prevent wasted effort. Collecting more data is expensive, so confirm it will actually help before investing.
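One cheap way to check this before collecting anything is to see whether validation error is still falling at the right-hand end of the curve. The sketch below assumes the curve is a list of (size, training error, validation error) tuples. The function name `validation_still_improving` and the 5% threshold are hypothetical choices, and the numbers are made up for illustration.

```python
def validation_still_improving(curve, min_rel_drop=0.05):
    """Return True if validation error fell by at least min_rel_drop
    (as a fraction) between the last two training sizes.

    curve: list of (size, train_err, val_err), sorted by size.
    """
    if len(curve) < 2:
        return False                      # not enough points to judge a trend
    prev_val = curve[-2][2]
    last_val = curve[-1][2]
    if prev_val <= 0:
        return False                      # nothing left to improve
    return (prev_val - last_val) / prev_val >= min_rel_drop

# Made-up curves: (size, train_err, val_err)
wide_gap = [(50, 0.01, 0.40), (100, 0.02, 0.30), (200, 0.03, 0.22)]
plateau = [(50, 0.28, 0.33), (100, 0.30, 0.32), (200, 0.31, 0.318)]

print(validation_still_improving(wide_gap))   # validation error still dropping
print(validation_still_improving(plateau))    # validation error has flattened
```

A curve that is still dropping suggests more data is worth trying. A flat one suggests the effort is better spent on the model. Comparing only the last two points is sensitive to noise, so averaging over several runs or sizes would be a reasonable refinement.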

Key idea

A learning curve plots error against training set size. Curves that converge at a high error signal bias, which calls for more capacity. A large gap between them signals variance, which more data can close.

Check yourself

1. What does a learning curve plot error against?

2. Both curves converge to a high error. What does this indicate?