Machine Learning

Canary Deploys For Models

Send a sliver of traffic to a new model before trusting it.

5 min read · advanced

The risk of a full swap

Replacing a serving model all at once is dangerous. A new version can be slower, more expensive, or quietly worse on real inputs in ways that offline tests miss. A canary deploy limits the blast radius of such a failure by exposing only a small share of users to the new version at first.

How a canary works

A small slice of live traffic, perhaps one or five percent, is routed to the new model while the rest stays on the stable one. The team watches metrics for the canary slice and compares them to the stable baseline.
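One way the split described above could be implemented is a deterministic hash of a request or user identifier. This is a minimal sketch, not a prescribed method: the `route` function, the `CANARY_PERCENT` constant, and the `user-…` identifiers are all hypothetical names chosen for illustration.

```python
import hashlib

# Assumed slice size; the lesson suggests one or five percent.
CANARY_PERCENT = 5


def route(request_id: str, canary_percent: float = CANARY_PERCENT) -> str:
    """Return "canary" or "stable" for a request, deterministically."""
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    # Map the first 8 bytes of the hash to a bucket in [0, 10000),
    # which gives a resolution of 0.01 percent.
    bucket = int.from_bytes(digest[:8], "big") % 10_000
    return "canary" if bucket < canary_percent * 100 else "stable"


if __name__ == "__main__":
    ids = [f"user-{i}" for i in range(100_000)]
    share = sum(route(i) == "canary" for i in ids) / len(ids)
    print(f"canary share: {share:.2%}")
```

Hashing the identifier, rather than drawing a random number per request, keeps each user on the same model for the whole experiment, which makes the canary metrics easier to interpret.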

What to compare

  • Latency and cost to catch performance regressions.
  • Quality signals such as user feedback, click-through rate, or error rate.
  • Output drift, where the new model answers very differently from the stable one.
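The comparison in the list above can be sketched as a small health check. Everything here is an assumption made for illustration: the `SliceStats` fields, the threshold values, and the idea of measuring drift as the share of paired outputs that differ (which presumes a sample of inputs has been run through both models).

```python
from dataclasses import dataclass
from statistics import mean, quantiles


@dataclass
class SliceStats:
    """Hypothetical metrics collected for one traffic slice."""
    latencies_ms: list
    costs_usd: list
    errors: int
    requests: int


def p95(values):
    # quantiles(n=20) returns 19 cut points; the last is the 95th percentile.
    return quantiles(values, n=20)[-1]


def disagreement_rate(stable_outputs, canary_outputs):
    """Share of paired outputs that differ: a crude output-drift signal."""
    pairs = list(zip(stable_outputs, canary_outputs))
    return sum(a != b for a, b in pairs) / len(pairs)


def compare(stable, canary, max_latency_ratio=1.2, max_cost_ratio=1.2,
            max_error_delta=0.01):
    """Return the names of regressed metrics; an empty list means healthy.

    The thresholds are illustrative defaults, not recommendations.
    """
    problems = []
    if p95(canary.latencies_ms) > max_latency_ratio * p95(stable.latencies_ms):
        problems.append("latency")
    if mean(canary.costs_usd) > max_cost_ratio * mean(stable.costs_usd):
        problems.append("cost")
    stable_rate = stable.errors / stable.requests
    canary_rate = canary.errors / canary.requests
    if canary_rate - stable_rate > max_error_delta:
        problems.append("error_rate")
    return problems
```

Comparing the canary against the stable slice over the same time window, rather than against historical numbers, helps separate model regressions from ordinary traffic changes.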

Promote or roll back

If the canary looks healthy, traffic is shifted gradually until the new model serves everyone. If metrics worsen, traffic is pulled back to the stable model immediately. The small initial slice means few users ever see a bad version.
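The promote-or-roll-back decision can be sketched as a single step function. The `RAMP_STEPS` schedule and the `next_canary_percent` name are hypothetical, and the `healthy` flag is assumed to come from whatever metric comparison the team runs.

```python
# Hypothetical ramp schedule, in percent of traffic sent to the new model.
RAMP_STEPS = [1, 5, 25, 50, 100]


def next_canary_percent(current: int, healthy: bool, steps=RAMP_STEPS) -> int:
    """Return the canary traffic share to use after one evaluation window."""
    if not healthy:
        return 0  # roll back: all traffic returns to the stable model
    for step in steps:
        if step > current:
            return step  # promote to the next larger slice
    return steps[-1]  # already fully promoted


if __name__ == "__main__":
    percent = 0
    for healthy in [True, True, True, False]:
        percent = next_canary_percent(percent, healthy)
        print(percent)  # prints 1, 5, 25, then 0 after the unhealthy window
```

Rolling back in one jump while promoting in several steps reflects the asymmetry in the text: a bad version should disappear immediately, while a good one earns traffic gradually.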

Key idea

A canary deploy routes a small slice of live traffic to a new model and compares its latency, cost, and quality against the stable baseline. Healthy canaries are promoted gradually and bad ones rolled back fast, so few users ever meet a flawed version.

Check yourself

1. What is the main benefit of a canary deploy for a model?

2. What happens if canary metrics look worse than the baseline?