The Validation Curve
A validation curve plots error against some axis, often training time or model complexity, for both the training set and the validation set. Reading it is one of the most practical skills in machine learning.
As training proceeds, the typical story unfolds in stages:
- Early on, both training error and validation error fall together as the model learns real signal
- Validation error then levels off and reaches its minimum, the point of best validation performance
- Eventually, training error keeps dropping while validation error starts to rise
That rising validation error is the visible fingerprint of overfitting. The growing gap between the two curves tells you the model is now fitting noise specific to the training data.
The lowest point of the validation curve marks the sweet spot: the model at that point is the one that generalizes best, as far as the validation data can tell. Picking that model, rather than the one with the lowest training error, is the whole point of holding out validation data.
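To make the selection rule concrete, here is a minimal sketch in Python. The error values are invented for illustration and do not come from any real training run, and the helper name best_epoch is hypothetical. It picks the epoch with the lowest validation error and contrasts it with the epoch that has the lowest training error.

```python
def best_epoch(errors):
    """Return the index of the lowest error (the first one on ties)."""
    if not errors:
        raise ValueError("errors must not be empty")
    return min(range(len(errors)), key=lambda i: errors[i])


# Hypothetical per-epoch errors, made up for this example only.
train_errors = [0.90, 0.60, 0.42, 0.31, 0.24, 0.18, 0.13, 0.09]
val_errors = [0.92, 0.65, 0.50, 0.44, 0.43, 0.46, 0.52, 0.60]

keep = best_epoch(val_errors)            # epoch to keep: lowest validation error
lowest_train = best_epoch(train_errors)  # epoch a training-only view would pick

# Gap between the curves at each epoch; it widens as overfitting sets in.
gap = [v - t for t, v in zip(train_errors, val_errors)]

print("keep epoch:", keep)
print("lowest training error at epoch:", lowest_train)
print("gap at kept epoch vs. final epoch:", round(gap[keep], 2), round(gap[-1], 2))
```

In this made-up run the training error is still falling at the last epoch, so choosing by training error alone would select the final, most overfit model. Choosing by validation error selects an earlier one.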
You can plot the same curve against complexity instead of time. Too little complexity sits on the underfitting side with both errors high. Too much sits on the overfitting side with a wide gap. The bottom of the validation curve is the balance you want.
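The complexity version of the curve can be sketched the same way. The example below is an illustration under stated assumptions, not a prescribed method: the data are synthetic (a noisy sine wave), the model is a piecewise-constant fit whose complexity is the number of bins, and the names fit_bins, predict, mse, and make_data are hypothetical helpers written for this sketch.

```python
import math
import random


def fit_bins(xs, ys, n_bins):
    """Fit a piecewise-constant model on [0, 1): one mean per equal-width bin."""
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for x, y in zip(xs, ys):
        b = min(int(x * n_bins), n_bins - 1)
        sums[b] += y
        counts[b] += 1
    fallback = sum(ys) / len(ys)  # used for bins that hold no training points
    return [s / c if c else fallback for s, c in zip(sums, counts)]


def predict(means, x):
    """Look up the fitted mean of the bin that x falls into."""
    n_bins = len(means)
    return means[min(int(x * n_bins), n_bins - 1)]


def mse(means, xs, ys):
    """Mean squared error of the fitted model on the given points."""
    return sum((predict(means, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def make_data(rng, n):
    """Synthetic data: a sine wave plus Gaussian noise (an assumption)."""
    xs = [rng.random() for _ in range(n)]
    ys = [math.sin(2 * math.pi * x) + rng.gauss(0, 0.3) for x in xs]
    return xs, ys


rng = random.Random(0)  # fixed seed so repeated runs give the same numbers
train_x, train_y = make_data(rng, 40)
val_x, val_y = make_data(rng, 40)

bin_counts = [1, 2, 4, 8, 16, 32, 64]  # increasing model complexity
results = []
for n_bins in bin_counts:
    means = fit_bins(train_x, train_y, n_bins)
    results.append((n_bins, mse(means, train_x, train_y), mse(means, val_x, val_y)))

for n_bins, train_err, val_err in results:
    print(f"{n_bins:>3} bins  train={train_err:.3f}  val={val_err:.3f}")

best_bins = min(results, key=lambda r: r[2])[0]
print("complexity with lowest validation error:", best_bins)
```

Because each bin count doubles the previous one, every finer model can reproduce the coarser one, so training error can only stay level or fall as complexity grows. Validation error has no such guarantee, which is why the choice of complexity is made on the validation column.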
Key idea
The validation curve shows when further training or added complexity stops improving generalization; its lowest validation point marks the best model to keep.