Machine Learning

Model Calibration

Making predicted probabilities mean what they say.

5 min read · advanced

What calibration means

A model is calibrated when its stated confidence matches how often it is actually right. Among all predictions made with seventy percent confidence, about seventy percent should be correct. Accuracy and calibration are separate properties: a model can be highly accurate yet poorly calibrated, for example reporting ninety-nine percent confidence when it is right only eighty percent of the time.
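Stated as a formula, the definition above reads as follows. The symbols are notation assumed here for illustration, not taken from the lesson: \hat{Y} is the predicted class, Y the true class, and \hat{P} the confidence the model attaches to its prediction.

```latex
\Pr\left(\hat{Y} = Y \,\middle|\, \hat{P} = p\right) = p
\quad \text{for all } p \in [0, 1]
```

With p = 0.7 this is the seventy percent example: of the predictions made with confidence 0.7, a fraction 0.7 should be correct.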

Why it matters

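Predicted probabilities feed downstream decisions, risk thresholds, and human trust. Overconfident outputs lead to bad automated choices and underestimated risk, because a threshold such as "act only above ninety percent confidence" is only meaningful if ninety percent really means ninety percent. Modern deep networks tend to be overconfident by default.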

Measuring it

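  • A reliability diagram groups predictions into confidence bins and plots observed accuracy against predicted confidence
  • The expected calibration error (ECE) averages the gap between confidence and accuracy across bins, weighting each bin by the share of predictions it holds
  • A perfectly calibrated model lies on the diagonal of the reliability diagram and has an expected calibration error of zero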
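The binning behind both the reliability diagram and the expected calibration error can be sketched in a few lines. This is a minimal illustration, assuming equal-width confidence bins; the function name `expected_calibration_error` and the choice of ten bins are assumptions made here, not something the lesson prescribes.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Weighted average of |confidence - accuracy| over equal-width bins.

    confidences: predicted confidence of each prediction, in [0, 1]
    correct:     True/False flags saying whether each prediction was right
    """
    n = len(confidences)
    # Per bin: [number of predictions, sum of confidences, number correct]
    bins = [[0, 0.0, 0] for _ in range(n_bins)]
    for conf, hit in zip(confidences, correct):
        # A confidence of exactly 1.0 is placed in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx][0] += 1
        bins[idx][1] += conf
        bins[idx][2] += int(hit)

    ece = 0.0
    for count, conf_sum, hits in bins:
        if count == 0:
            continue  # empty bins contribute nothing
        avg_confidence = conf_sum / count
        accuracy = hits / count
        # Each bin is weighted by the share of predictions it holds.
        ece += (count / n) * abs(avg_confidence - accuracy)
    return ece


# The overconfident model from the lesson: 99% confidence, right 80% of the time.
confidences = [0.99] * 100
correct = [True] * 80 + [False] * 20
print(round(expected_calibration_error(confidences, correct), 3))  # 0.19
```

The per-bin pairs of average confidence and accuracy computed inside the loop are exactly the points a reliability diagram would plot; the expected calibration error collapses them into a single number.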

Fixing it

Calibration is usually a cheap post-processing step fit on held-out data, leaving the trained model itself unchanged.

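  • Temperature scaling divides the logits by a single learned temperature before the softmax: a temperature above one softens the probabilities, a temperature below one sharpens them, and the predicted class never changes because the ordering of the logits is preserved
  • Platt scaling fits a logistic (sigmoid) transform to the model's scores
  • Isotonic regression fits a flexible monotonic mapping from scores to probabilities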

Temperature scaling is the most popular because it is simple, leaves accuracy untouched, and only needs one parameter tuned on a validation set.
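A minimal sketch of temperature scaling, assuming the temperature is chosen to minimize the negative log-likelihood on validation data. The helper names (`log_softmax`, `softmax`, `nll`, `fit_temperature`), the search range, and the toy data are all assumptions for illustration. A golden-section search stands in here for the gradient-based optimizer a real implementation would more likely use; it works because the loss has a single minimum in the temperature.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Log-probabilities of softmax(logits / temperature), computed stably."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    log_norm = peak + math.log(sum(math.exp(s - peak) for s in scaled))
    return [s - log_norm for s in scaled]


def softmax(logits, temperature=1.0):
    return [math.exp(v) for v in log_softmax(logits, temperature)]


def nll(logit_rows, labels, temperature):
    """Average negative log-likelihood of the true labels."""
    total = 0.0
    for logits, label in zip(logit_rows, labels):
        total -= log_softmax(logits, temperature)[label]
    return total / len(labels)


def fit_temperature(logit_rows, labels, low=0.05, high=20.0, iterations=60):
    """Golden-section search over log(temperature) for the lowest NLL."""
    inv_phi = (math.sqrt(5) - 1) / 2
    a, b = math.log(low), math.log(high)
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc = nll(logit_rows, labels, math.exp(c))
    fd = nll(logit_rows, labels, math.exp(d))
    for _ in range(iterations):
        if fc < fd:
            # Minimum lies in [a, d]; reuse c as the new upper probe.
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = nll(logit_rows, labels, math.exp(c))
        else:
            # Minimum lies in [c, b]; reuse d as the new lower probe.
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = nll(logit_rows, labels, math.exp(d))
    return math.exp((a + b) / 2)


# Toy validation set (made up): the model always predicts class 0 with
# about 98% confidence, but is right only 4 times out of 5.
validation_logits = [[4.0, 0.0]] * 5
validation_labels = [0, 0, 0, 0, 1]

temperature = fit_temperature(validation_logits, validation_labels)
print(round(temperature, 2))                                       # 2.89
print(round(softmax(validation_logits[0])[0], 3))                  # 0.982
print(round(softmax(validation_logits[0], temperature)[0], 3))     # 0.8
```

In this toy case the fitted temperature is above one, so the confidence drops from about 0.98 to about 0.80, matching the observed accuracy, while class 0 remains the predicted class for every example.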

Key idea

Calibration aligns predicted confidence with real accuracy, often via cheap post hoc methods like temperature scaling.

Check yourself

1. What does it mean for a model to be well calibrated?

2. What does temperature scaling change?

3. Which metric summarizes calibration quality across bins?