Machine Learning

The Train Validation Test Split

Separate data into three roles to tune honestly and report unbiased results.

Reliable evaluation needs three distinct data roles: train, validation, and test. Mixing these roles is the fastest way to fool yourself with optimistic numbers.

The three roles

  • Training set fits the model parameters such as weights.
  • Validation set guides decisions like hyperparameters, model choice, and early stopping.
  • Test set is touched only once at the end to report an unbiased estimate.
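The three roles above can be sketched as a single shuffle followed by two cuts. This is a minimal illustration using only the Python standard library; the function name `train_val_test_split` and its parameters `val_frac`, `test_frac`, and `seed` are hypothetical names chosen for this example, not part of any library.

```python
import random

def train_val_test_split(rows, val_frac=0.1, test_frac=0.1, seed=0):
    """Shuffle a copy of rows, then cut it into train, validation, and test lists."""
    if val_frac < 0 or test_frac < 0 or val_frac + test_frac >= 1:
        raise ValueError("val_frac and test_frac must be non-negative and sum to less than 1")
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n_test = round(len(shuffled) * test_frac)
    n_val = round(len(shuffled) * val_frac)
    test = shuffled[:n_test]                 # locked away until the final report
    val = shuffled[n_test:n_test + n_val]    # used for tuning decisions
    train = shuffled[n_test + n_val:]        # used to fit parameters
    return train, val, test

train, val, test = train_val_test_split(range(100), val_frac=0.1, test_frac=0.1)
print(len(train), len(val), len(test))  # 80 10 10
```

A plain random shuffle assumes the examples are independent. Time-ordered or grouped data would likely need a different cutting rule.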

Why three, not two

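If you tune on the test set, you implicitly fit to it: every choice made by looking at test scores favors whatever happens to work on those particular examples, so the reported score becomes optimistic. The validation set absorbs all the tuning decisions, protecting the test set from contamination.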
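As a rough sketch of this workflow, the example below picks a hyperparameter using validation accuracy and then evaluates on the test set exactly once. The model (a tiny 1-D k-nearest-neighbors classifier), the synthetic data, and the candidate values of `k` are all assumptions made for illustration.

```python
import random

def knn_predict(train, x, k):
    """Majority vote among the k training points closest to x (1-D features)."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if votes * 2 > k else 0

def accuracy(train, data, k):
    """Fraction of points in data that k-NN, fitted on train, labels correctly."""
    hits = sum(knn_predict(train, x, k) == y for x, y in data)
    return hits / len(data)

rng = random.Random(0)
# Synthetic data (assumption): class 0 is centered at 0.0, class 1 at 1.0.
points = [(rng.gauss(label, 0.7), label) for label in (0, 1) for _ in range(150)]
rng.shuffle(points)
train, val, test = points[:210], points[210:255], points[255:]  # 70/15/15

# Tuning: every candidate is compared on the validation set only.
candidates = [1, 3, 5, 9, 15]
best_k = max(candidates, key=lambda k: accuracy(train, val, k))

# Reporting: the test set is used once, after the choice is final.
test_accuracy = accuracy(train, test, best_k)
print(best_k, test_accuracy)
```

Had `best_k` been chosen with `accuracy(train, test, k)` instead, the printed score would be the maximum over several looks at the test set rather than an unbiased estimate.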

Good practice

  • Split before any preprocessing that learns from data, such as scaling or imputation, and fit those steps on the training set only, to avoid leakage.
  • Keep the test set locked away until the very end.
  • Use train/validation/test proportions like 70/20/10 or 80/10/10, adjusting for dataset size.
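To make the leakage point concrete, here is a small sketch of standardization in which the mean and standard deviation are learned from the training values only and then reused unchanged on the validation and test values. The helper names `fit_scaler` and `transform` and the toy numbers are hypothetical, chosen for this example.

```python
import statistics

def fit_scaler(values):
    """Learn the mean and standard deviation from the given values."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values) or 1.0  # guard against a constant column
    return mean, std

def transform(values, scaler):
    """Apply a previously fitted scaler without re-learning anything."""
    mean, std = scaler
    return [(v - mean) / std for v in values]

train = [2.0, 4.0, 6.0, 8.0]
val = [5.0]
test = [10.0]

scaler = fit_scaler(train)            # statistics come from the training set alone
train_scaled = transform(train, scaler)
val_scaled = transform(val, scaler)   # reuses the training mean and std
test_scaled = transform(test, scaler)
print(scaler[0], val_scaled[0])  # 5.0 0.0
```

Calling `fit_scaler` on all six values before splitting would let the validation and test values shift the mean and standard deviation, which is the kind of leakage the first bullet warns about.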

Key idea

Train fits parameters, validation guides tuning, and the test set is touched once for an unbiased report, so keeping these roles separate prevents optimistic, contaminated results.

Check yourself

1. What is the validation set used for?

2. Why must the test set be touched only once?