What it is
A canary rollout releases a new model to a small fraction of live traffic first, such as one percent. You watch its health metrics, and only if they look good do you gradually raise its share until it takes all traffic.
The steps
- Route a small slice of requests to the new model and the rest to the old one
- Monitor error rate, latency, and key prediction metrics on the canary slice
- If everything is healthy, increase the slice in stages
- If something breaks, route all traffic back to the old model
Why it limits blast radius
The point is to bound the damage. If the new model is bad, only the small canary slice is affected, and you can roll back fast. This is far safer than flipping every user to a new model at once.
Canary versus other strategies
A canary differs from a shadow deployment because the canary actually serves users, so it can measure real impact. It differs from an A B test because its main goal is safety and gradual exposure, not a careful statistical comparison.
Key idea
A canary rollout exposes a new model to a small traffic slice first so failures stay small and easy to undo.