Monitoring the outputs
When labels are delayed, the model's predictions themselves become a monitoring signal. If the distribution of scores or predicted classes shifts sharply, something has changed, even before ground truth arrives to confirm it.
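One way to put a number on "shifts sharply" is to compare a recent window of scores against a baseline window. The sketch below uses the population stability index (PSI), which is one common choice and not something these notes prescribe. It assumes scores are probabilities in [0, 1] and both windows are non-empty. The function name `psi`, the 10 equal-width bins, and the `eps` floor are illustrative assumptions.

```python
import math


def psi(baseline, current, bins=10, eps=1e-6):
    """Population stability index between two lists of scores in [0, 1].

    Assumption: equal-width bins over [0, 1]; bins and eps are illustrative.
    """
    def bin_fractions(scores):
        counts = [0] * bins
        for s in scores:
            # Clamp so that a score of exactly 1.0 lands in the last bin.
            idx = min(int(s * bins), bins - 1)
            counts[idx] += 1
        total = len(scores)
        # The eps floor keeps empty bins from producing log(0) or a zero division.
        return [max(c / total, eps) for c in counts]

    p = bin_fractions(baseline)
    q = bin_fractions(current)
    # Each term is non-negative, so PSI is 0 only when the binned shapes match.
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))


# Hypothetical windows: a uniform baseline and a window pushed toward high scores.
baseline_scores = [i / 100 for i in range(100)]
shifted_scores = [min(0.99, 0.5 + i / 200) for i in range(100)]

same = psi(baseline_scores, baseline_scores)    # identical windows
moved = psi(baseline_scores, shifted_scores)    # shifted window
```

A larger value means the two windows differ more. What counts as "sharp" depends on the model and traffic volume, so any alert threshold is best calibrated against historical windows known to be healthy.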
What a shift can mean
- Upstream data broke, feeding the model garbage that pushes outputs to extremes.
- Real-world change, where the population genuinely shifted.
- A bad deploy, where a new model version scores very differently.
Signals to track
- The average predicted score and its spread over time.
- The class balance of predictions versus a historical baseline.
- The rate of predictions near a decision threshold.
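The three signals above can be computed per time window with a few lines of code. This is a minimal sketch, assuming a binary classifier that emits scores in [0, 1]. The name `output_summary`, the 0.5 threshold, and the 0.05 margin that defines "near" are hypothetical values chosen for illustration.

```python
from statistics import mean, pstdev


def output_summary(scores, threshold=0.5, margin=0.05):
    """Summarize one window of prediction scores.

    Assumption: scores >= threshold are predicted positive, and "near the
    threshold" means within +/- margin of it.
    """
    n = len(scores)
    positives = sum(1 for s in scores if s >= threshold)
    near = sum(1 for s in scores if abs(s - threshold) <= margin)
    return {
        "mean_score": mean(scores),         # average predicted score
        "score_std": pstdev(scores),        # spread of scores in the window
        "positive_rate": positives / n,     # class balance of predictions
        "near_threshold_rate": near / n,    # share of borderline predictions
    }


# Hypothetical window of six scores.
window = [0.1, 0.2, 0.48, 0.52, 0.9, 0.95]
summary = output_summary(window)
```

Computing the same summary on a historical baseline window and on the current window, then comparing the two dictionaries field by field, gives the "versus a historical baseline" comparison.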
A caution
A prediction shift is ambiguous. It can reflect genuine change or a defect. Pair it with input drift and operational metrics to narrow down the cause before acting.
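The pairing can be written down as a small triage rule. The sketch below is an illustrative heuristic, not a rule taken from these notes: it assumes four boolean checks are already computed elsewhere, and the function name `triage`, the flag names, and the order of the checks are all assumptions. Its output is a hypothesis to investigate, not a diagnosis.

```python
def triage(prediction_shift, input_drift, pipeline_errors, recent_deploy):
    """Map monitoring flags to a first hypothesis (hypothetical heuristic).

    prediction_shift: the output distribution moved versus baseline
    input_drift:      the input feature distributions moved versus baseline
    pipeline_errors:  operational metrics show upstream problems
                      (for example, a spike in missing values)
    recent_deploy:    a new model version went live in the window
    """
    if not prediction_shift:
        return "no output shift detected"
    # Outputs moved while inputs look stable: the model itself likely changed.
    if recent_deploy and not input_drift:
        return "suspect bad deploy"
    # Operational errors point at broken data rather than a real change.
    if pipeline_errors:
        return "suspect upstream data break"
    # Inputs moved and the pipeline looks healthy: the population may have shifted.
    if input_drift:
        return "suspect real-world change"
    return "inconclusive: investigate further"


hypothesis = triage(
    prediction_shift=True,
    input_drift=True,
    pipeline_errors=False,
    recent_deploy=False,
)
```

The checks are ordered so that defects, which are usually cheaper to confirm and fix, are considered before concluding that the population genuinely changed.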
Key idea
Prediction distribution shift uses the model's outputs as an early, label-free signal, but it is ambiguous and must be paired with input drift and operational metrics to separate real change from defects.