Reverting fast and safely
A rollback returns serving to the last known good model version. Triggers are the predefined conditions that fire a rollback, ideally automatically, so a bad deploy is reverted in minutes rather than after a long investigation.
Good trigger signals
- Quality drop, accuracy or a proxy metric falls below a floor.
- Operational breach, error rate or latency exceeds limits.
- Prediction anomaly, output distribution shifts sharply from baseline.
- Guardrail breach, a business metric like revenue per session regresses.
Making rollback reliable
- Keep the previous version warm so reverting is instant, not a fresh deploy.
- Version models and data so the exact good state is reproducible.
- Test the rollback path itself, since an untested revert can fail when most needed.
Avoiding flapping
Require a trigger to hold for a short sustained window so a brief blip does not cause a noisy revert back and forth.
Key idea
Rollback triggers are predefined automatic conditions on quality, operations, and guardrails that revert to a warm last good model, with sustained windows to avoid flapping.