From indicator to objective
A service level objective, or SLO, is a target for a service level indicator (SLI) measured over a time window. For example: ninety nine point nine percent of requests succeed over thirty days. It states how good is good enough, and it is deliberately below one hundred percent.
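As a minimal sketch of the definition above, an SLO can be modelled as a target plus a window, and checked against a measured SLI. The names `SLO`, `sli`, and `meets_slo` are hypothetical and chosen for illustration, and the SLI here is assumed to be a simple ratio of successful requests to total requests.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SLO:
    """A target for an SLI over a window (illustrative type, not a real library)."""
    target: float      # e.g. 0.999 for ninety nine point nine percent
    window_days: int   # e.g. 30


def sli(good_requests: int, total_requests: int) -> float:
    """Fraction of requests that succeeded in the window."""
    if total_requests == 0:
        # Assumption: a window with no traffic counts as fully successful.
        return 1.0
    return good_requests / total_requests


def meets_slo(slo: SLO, good_requests: int, total_requests: int) -> bool:
    """True when the measured SLI is at or above the objective."""
    return sli(good_requests, total_requests) >= slo.target


availability = SLO(target=0.999, window_days=30)
print(meets_slo(availability, good_requests=999_500, total_requests=1_000_000))
print(meets_slo(availability, good_requests=998_000, total_requests=1_000_000))
```

The first call is within the objective and the second is not, since 998,000 successes out of 1,000,000 is only ninety nine point eight percent.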
The error budget
If the SLO is ninety nine point nine percent, then one tenth of one percent of requests are allowed to fail. That allowance is the error budget: the amount of unreliability you may spend before breaching the objective.
- A healthy budget means you can take risks, ship features, and run experiments.
- A spent budget means you should freeze risky changes and focus on reliability.
The error budget turns an emotional argument about stability into a numeric decision that product and operations can share.
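The arithmetic behind the budget can be sketched as follows. This is an illustration under stated assumptions, not a prescribed policy: the function names are hypothetical, the budget is counted in requests, and the ship versus freeze rule is reduced to "any budget left means ship". `Fraction` is used only to keep the arithmetic exact.

```python
from fractions import Fraction


def error_budget(target: Fraction, total_requests: int) -> Fraction:
    """Number of requests allowed to fail in the window."""
    return (1 - target) * total_requests


def budget_remaining(target: Fraction, total_requests: int, failed_requests: int) -> Fraction:
    """Fraction of the budget still unspent; negative means the SLO is breached."""
    budget = error_budget(target, total_requests)
    if budget == 0:
        raise ValueError("no budget: target is one hundred percent or there is no traffic")
    return 1 - Fraction(failed_requests) / budget


def release_decision(remaining: Fraction) -> str:
    """Assumed policy for illustration: ship while budget remains, otherwise freeze."""
    return "ship" if remaining > 0 else "freeze"


target = Fraction("0.999")
print(error_budget(target, 1_000_000))                          # allowed failures
print(release_decision(budget_remaining(target, 1_000_000, 250)))
print(release_decision(budget_remaining(target, 1_000_000, 1_200)))
```

With one million requests in the window, the budget is one thousand failed requests. After 250 failures three quarters of it remains, while 1,200 failures overspends it.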
Burn rate
The burn rate is how fast you are consuming the budget relative to the window. A burn rate of one spends exactly the whole budget by the end of the window, and a burn rate of two exhausts it in half the window. A high burn rate signals a fast moving incident, so alerting on burn rate catches serious problems early while sparing you noise during slow drift.
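One common way to compute a burn rate is to divide the observed error rate by the error rate the SLO allows. The sketch below takes that approach; the function names are hypothetical, and the exhaustion estimate assumes the error rate stays constant and the budget starts full.

```python
from fractions import Fraction
from typing import Optional


def burn_rate(target: Fraction, failed_requests: int, total_requests: int) -> Fraction:
    """Observed error rate divided by the error rate the SLO allows."""
    if target >= 1:
        raise ValueError("target must be below one hundred percent")
    if total_requests == 0:
        return Fraction(0)
    return Fraction(failed_requests, total_requests) / (1 - target)


def days_to_exhaustion(rate: Fraction, window_days: int) -> Optional[Fraction]:
    """Days until a full budget is gone at this rate; None if nothing is burning."""
    if rate == 0:
        return None
    return Fraction(window_days) / rate


target = Fraction("0.999")

steady = burn_rate(target, failed_requests=10, total_requests=10_000)
print(steady, days_to_exhaustion(steady, 30))

incident = burn_rate(target, failed_requests=144, total_requests=10_000)
print(float(incident), float(days_to_exhaustion(incident, 30)))
```

In the first sample the error rate equals the allowed rate, so the burn rate is one and the budget lasts the full thirty days. In the second, the burn rate is 14.4 and the budget would be gone in a little over two days.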
Why not aim for one hundred percent
Perfect reliability is enormously expensive, and beyond a certain point users cannot perceive the extra reliability. The SLO sets the level worth paying for and frees the remaining effort for velocity.
Key idea
An SLO sets a reliability target below one hundred percent, and its error budget converts reliability into a shared, numeric decision about when to ship versus when to stabilize.