What calibration means
A model is calibrated when its predicted probabilities match observed frequencies. Among cases it calls seventy percent likely, about seventy percent should truly be positive. A confident but miscalibrated model misleads any downstream decision.
Why models drift off
- Many classifiers output scores that rank well but are not true probabilities.
- Margin-based models such as support vector machines output a signed distance from the decision boundary, not a probability.
- Heavily regularized or boosted models often distort probabilities systematically, either compressing them toward the middle or pushing them toward 0 and 1.
Checking it
- Plot a reliability diagram: bin the predictions, then plot each bin's mean predicted probability against its observed fraction of positives.
- A perfectly calibrated model sits on the diagonal.
- Summarize the gap with metrics such as the expected calibration error.
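The checks above can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: the function names, the choice of ten equal-width bins, and the toy data are all assumptions made for the example. It computes the points of a reliability diagram and the expected calibration error as the count-weighted average gap between mean predicted probability and observed frequency.

```python
def reliability_bins(probs, labels, n_bins=10):
    """Group predictions into equal-width bins over [0, 1].

    Returns (mean_predicted, observed_frequency, count) for each non-empty
    bin. These triples are the points of a reliability diagram.
    """
    sums = [[0.0, 0.0, 0] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        sums[idx][0] += p
        sums[idx][1] += y
        sums[idx][2] += 1
    return [(sp / n, sy / n, n) for sp, sy, n in sums if n > 0]


def expected_calibration_error(probs, labels, n_bins=10):
    """Count-weighted average of |observed frequency - mean predicted|."""
    total = len(probs)
    return sum(
        n / total * abs(freq - conf)
        for conf, freq, n in reliability_bins(probs, labels, n_bins)
    )


# Toy data, made up for illustration: ten cases scored 0.9, only six positive.
probs = [0.9] * 10
labels = [1] * 6 + [0] * 4
for conf, freq, n in reliability_bins(probs, labels):
    print(f"predicted {conf:.2f}  observed {freq:.2f}  n={n}")
print("ECE:", round(expected_calibration_error(probs, labels), 3))
```

Equal-width bins are the simplest choice. With few samples, sparse bins give noisy frequencies, so the number of bins is worth varying before trusting the summary number.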
Fixing it
- Platt scaling fits a logistic curve to the model scores.
- Isotonic regression fits a flexible nondecreasing mapping, but it needs more data to avoid overfitting.
- Always fit the calibrator on a separate held-out split, not the data the model was trained on, to avoid optimistic bias.
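Both fixes can be sketched without any libraries. The code below is a simplified illustration under stated assumptions: `fit_platt` fits the logistic curve by plain gradient descent on log loss (Platt's original procedure differs in its optimizer and target smoothing), `fit_isotonic` uses the pool-adjacent-violators algorithm without special handling of tied scores or interpolation, and `cal_scores` / `cal_labels` are made-up values standing in for a held-out calibration split.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_platt(scores, labels, lr=0.1, epochs=2000):
    """Fit p = sigmoid(a * score + b) by gradient descent on log loss."""
    a, b = 1.0, 0.0
    n = len(scores)
    for _ in range(epochs):
        grad_a = grad_b = 0.0
        for s, y in zip(scores, labels):
            err = sigmoid(a * s + b) - y
            grad_a += err * s
            grad_b += err
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b


def fit_isotonic(scores, labels):
    """Pool adjacent violators: nondecreasing steps from score to frequency.

    Returns a list of (upper_score_bound, fitted_value) pairs.
    """
    blocks = []  # each block: [sum_of_labels, count, max_score_in_block]
    for s, y in sorted(zip(scores, labels)):
        blocks.append([float(y), 1, s])
        # Merge backwards while the previous block's mean exceeds this one's.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count, hi = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
            blocks[-1][2] = hi
    return [(hi, total / count) for total, count, hi in blocks]


def predict_isotonic(steps, score):
    """Value of the first block whose upper bound covers the score."""
    for hi, value in steps:
        if score <= hi:
            return value
    return steps[-1][1]  # scores above the fitted range get the top value


# Hypothetical held-out split: raw model scores and true labels.
cal_scores = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0, 2.5]
cal_labels = [0, 0, 1, 0, 0, 1, 1, 0, 1, 1]

a, b = fit_platt(cal_scores, cal_labels)
steps = fit_isotonic(cal_scores, cal_labels)
for s in (-1.0, 0.0, 1.0):
    print(s, round(sigmoid(a * s + b), 3), round(predict_isotonic(steps, s), 3))
```

The Platt curve has only two parameters, so it stays stable on small calibration sets. The isotonic fit can follow any monotone shape, which is why it needs more held-out data before its steps become trustworthy.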
Key idea
Calibration aligns predicted probabilities with real frequencies. Reliability diagrams diagnose miscalibration, and Platt scaling or isotonic regression fitted on held-out data corrects it.