Machine Learning

Fair Model Comparison

Compare models under matched conditions so the winner is real.

6 min read · advanced

Apples to apples

To claim model A beats model B you must compare them fairly. A difference caused by an uneven setup is not a real improvement. Control everything except the change under test.

  • Use the same train, validation, and test splits.
  • Give every model the same tuning budget; do not give one a head start.
  • Evaluate with the same metric and preprocessing.
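The checklist above can be sketched in code. This is a minimal illustration, not a prescribed implementation: `make_splits`, `compare`, the toy data, and the two candidate models (`fit_majority`, `fit_threshold`) are all hypothetical names invented for the example. The point is that the splits and the metric are built once and passed unchanged to every candidate.

```python
import random

def make_splits(n_examples, seed=0, fractions=(0.6, 0.2)):
    """Shuffle indices once and cut them into train/validation/test."""
    idx = list(range(n_examples))
    random.Random(seed).shuffle(idx)
    n_train = int(fractions[0] * n_examples)
    n_val = int(fractions[1] * n_examples)
    return {
        "train": idx[:n_train],
        "val": idx[n_train:n_train + n_val],
        "test": idx[n_train + n_val:],
    }

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def compare(models, X, y, splits, metric=accuracy):
    """Fit every candidate on the same train rows, score on the same test rows."""
    train, test = splits["train"], splits["test"]
    scores = {}
    for name, fit in models.items():
        predict = fit([X[i] for i in train], [y[i] for i in train])
        scores[name] = metric([y[i] for i in test], [predict(X[i]) for i in test])
    return scores

# Two hypothetical candidates, each returning a predict function.
def fit_majority(X_train, y_train):
    majority = max(set(y_train), key=y_train.count)
    return lambda x: majority

def fit_threshold(X_train, y_train):
    # Cut halfway between the two class means.
    mean0 = sum(x for x, t in zip(X_train, y_train) if t == 0) / y_train.count(0)
    mean1 = sum(x for x, t in zip(X_train, y_train) if t == 1) / y_train.count(1)
    cut = (mean0 + mean1) / 2
    return lambda x: int(x > cut)

# Toy data (synthetic, for illustration only): the label is 1 when x > 0.5.
rng = random.Random(42)
X = [rng.random() for _ in range(200)]
y = [int(x > 0.5) for x in X]

splits = make_splits(len(X), seed=0)  # built once, shared by all candidates
scores = compare({"majority": fit_majority, "threshold": fit_threshold}, X, y, splits)
print(scores)
```

Because both candidates see identical rows and the same `accuracy` function, any gap in `scores` comes from the models rather than from the setup.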

Noise and significance

A single test score has variance: it shifts with the random seed and with which examples happen to be in the test set. A small gap may be noise, especially on a small test set.

  • Run multiple seeds and report mean and spread.
  • Use a significance check or confidence interval on the gap.
  • Beware of tuning a model on the test set. This is a form of leakage that inflates that model's score.
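A rough sketch of the first two checks, using only the standard library. The per-seed accuracies are made-up numbers for illustration, and `summarize` and `gap_interval` are hypothetical helper names. The interval uses a normal approximation on the paired per-seed gaps, which assumes both models were run with the same seeds; with only a handful of seeds a t-based or bootstrap interval would be wider and is usually the safer choice.

```python
from statistics import NormalDist, mean, stdev

def summarize(scores):
    """Mean and sample standard deviation across seeds."""
    return mean(scores), stdev(scores)

def gap_interval(scores_a, scores_b, confidence=0.95):
    """Normal-approximation interval for the mean per-seed gap (A minus B)."""
    gaps = [a - b for a, b in zip(scores_a, scores_b)]
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * stdev(gaps) / len(gaps) ** 0.5
    centre = mean(gaps)
    return centre - half_width, centre + half_width

# Illustrative, made-up accuracies from five seeds (same seeds for both models).
scores_a = [0.812, 0.799, 0.825, 0.808, 0.817]
scores_b = [0.805, 0.810, 0.801, 0.812, 0.798]

mean_a, spread_a = summarize(scores_a)
mean_b, spread_b = summarize(scores_b)
low, high = gap_interval(scores_a, scores_b)

# If the interval contains zero, the gap is not distinguishable from seed noise.
gap_is_clear = low > 0 or high < 0
print(f"A: {mean_a:.3f} +/- {spread_a:.3f}, B: {mean_b:.3f} +/- {spread_b:.3f}")
print(f"gap interval: ({low:.3f}, {high:.3f}), clear: {gap_is_clear}")
```

In this made-up example model A has the higher mean, yet the interval on the gap straddles zero, so the data do not support calling A the winner.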

A fair protocol

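Fix the splits, metric, and preprocessing; give every candidate the same tuning budget; run several seeds; and compare the gap with the seed-to-seed spread. Only then does a win mean something.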
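The tuning-budget part of such a protocol could be sketched as follows. Everything here is an assumption made for illustration: `random_search`, the search spaces, and the two validation-score functions are hypothetical stand-ins for "train with this configuration, then score on the validation split". What matters is that every candidate gets the same number of trials and that selection uses validation scores only.

```python
import random

BUDGET = 20  # same number of tuning trials for every candidate

def random_search(sample_config, validation_score, budget, seed):
    """Try `budget` random configurations and keep the best by validation score."""
    rng = random.Random(seed)
    trials = [sample_config(rng) for _ in range(budget)]
    return max(trials, key=validation_score)

# Hypothetical toy functions standing in for real training plus validation scoring.
def val_score_a(cfg):
    return 0.80 - (cfg["lr"] - 0.1) ** 2

def val_score_b(cfg):
    return 0.78 - 0.01 * abs(cfg["depth"] - 5)

# Each candidate has its own search space but the same budget.
search_spaces = {
    "model_a": (lambda rng: {"lr": rng.uniform(0.001, 1.0)}, val_score_a),
    "model_b": (lambda rng: {"depth": rng.randint(1, 10)}, val_score_b),
}

best = {
    name: random_search(sample, score, BUDGET, seed=0)
    for name, (sample, score) in search_spaces.items()
}
print(best)
# Only after this step would each tuned model be scored, once, on the test split.
```

Keeping the test split out of `random_search` entirely is what prevents the leakage described earlier.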

Key idea

Fair model comparison fixes splits, metric, and tuning budget across candidates and checks the gap against seed variance, so the reported winner reflects a real difference rather than setup luck.

Check yourself

1. What must be held constant to compare two models fairly?

2. Why check the gap against seed variance?

3. Tuning one model's hyperparameters on the test set is an example of what?