System Design

The Canary Deployment Analysis

Send a sliver of traffic to a new version and compare it before going wide.

A canary deployment releases a new version to a small slice of traffic first, watches how it behaves, and only widens the rollout if it looks healthy. The name comes from the canary in a coal mine: the canary is an early warning that surfaces problems before the whole fleet is exposed.

The analysis is the point

Routing one percent of traffic is easy. The hard and valuable part is canary analysis: deciding whether the canary is actually healthy. Good analysis compares the canary against a baseline of the old version running at the same time, so that shifts in load or traffic mix affect both sides equally and do not fool you.

  • Compare error rate, latency percentiles, and saturation between canary and baseline
  • Use enough traffic and time for the comparison to be statistically meaningful
  • Automate the decision so a bad canary is rolled back without waiting on a human
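The checklist above can be sketched as a small automated check. This is a minimal illustration, not a production analyzer: it covers error rate only (latency percentiles and saturation would need their own comparisons), and the `Sample` type, the `canary_verdict` function, the `min_requests` floor, and the `alpha` threshold are all hypothetical names and values chosen for the example. It uses a one-sided two-proportion z-test, which is one common choice among several.

```python
import math
from dataclasses import dataclass
from statistics import NormalDist


@dataclass
class Sample:
    """Request counts for one version, collected over the same time window."""
    requests: int
    errors: int

    @property
    def error_rate(self) -> float:
        return self.errors / self.requests


def canary_verdict(canary: Sample, baseline: Sample,
                   min_requests: int = 1000, alpha: float = 0.05) -> str:
    """Return 'wait', 'rollback', or 'pass' (thresholds are illustrative)."""
    # Too little traffic: the comparison is not yet statistically meaningful.
    if min(canary.requests, baseline.requests) < min_requests:
        return "wait"

    # Pooled error rate under the assumption that both versions behave the same.
    pooled = (canary.errors + baseline.errors) / (canary.requests + baseline.requests)
    se = math.sqrt(pooled * (1 - pooled)
                   * (1 / canary.requests + 1 / baseline.requests))
    if se == 0:
        # Both versions have identical all-or-nothing error rates: no difference.
        return "pass"

    z = (canary.error_rate - baseline.error_rate) / se
    # One-sided p-value: chance of a gap this large if the versions were equal.
    p_value = 1 - NormalDist().cdf(z)
    return "rollback" if p_value < alpha else "pass"
```

A deployment pipeline could call `canary_verdict` on a schedule and act on the result without a human in the loop: keep collecting on `"wait"`, revert on `"rollback"`, and proceed on `"pass"`.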

Why compare to a live baseline

Comparing the canary to yesterday's metrics is misleading, because load and user mix change from hour to hour. A concurrent baseline running the old code experiences the same swings, so the difference between the two isolates the effect of the new version.
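A toy calculation makes the point concrete. The sketch below assumes a deliberately simple additive model in which today's load adds the same error-rate penalty to both versions; the function name `observed_gaps` and every number in the usage note are invented for illustration.

```python
def observed_gaps(old_rate_yesterday: float,
                  load_penalty: float,
                  version_effect: float) -> tuple[float, float]:
    """Return (gap vs yesterday, gap vs concurrent baseline) in error rate.

    Assumes load affects old and new code equally, which is a simplification.
    """
    baseline_now = old_rate_yesterday + load_penalty   # old code, right now
    canary_now = baseline_now + version_effect         # new code, right now

    vs_yesterday = canary_now - old_rate_yesterday     # load swing + version effect
    vs_concurrent = canary_now - baseline_now          # load swing cancels out
    return vs_yesterday, vs_concurrent
```

With `observed_gaps(0.010, 0.010, 0.0)`, a new version that changes nothing still appears about one percentage point worse than yesterday, while the gap against the concurrent baseline is zero.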

Progressive rollout

If analysis passes, increase the canary share in steps (one percent, then ten, then fifty), re-evaluating at each stage before full rollout.
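The staged rollout can be sketched as a loop that gates each step on analysis. This is a minimal outline under assumptions: `analyze` and `set_canary_share` are hypothetical callbacks standing in for whatever analysis and traffic-routing mechanisms a real system provides, and the default stages mirror the percentages in the text.

```python
from typing import Callable, Sequence


def progressive_rollout(
    analyze: Callable[[int], bool],
    set_canary_share: Callable[[int], None],
    stages: Sequence[int] = (1, 10, 50),
) -> bool:
    """Widen the canary stage by stage; return True if it reaches 100%."""
    for percent in stages:
        set_canary_share(percent)      # route this share of traffic to the canary
        if not analyze(percent):       # re-evaluate at every stage
            set_canary_share(0)        # failed analysis: roll back automatically
            return False
    set_canary_share(100)              # every stage passed: full rollout
    return True
```

For example, passing `shares.append` as `set_canary_share` with an `analyze` that always returns `True` records the shares 1, 10, 50, and 100 in order; if analysis fails at the fifty percent stage, the recorded sequence ends with 0 instead.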

Key idea

A canary exposes a new version to a small slice of traffic and compares it against a concurrent baseline, with automated rollback so that only versions that pass analysis are rolled out more widely.

Check yourself

1. Why compare a canary against a concurrent baseline rather than yesterday's metrics?

2. What is the most valuable hard part of a canary deployment?

3. What should a healthy canary analysis trigger?