The budget hidden in your SLO
If your service level objective (SLO) is 99.9% success, then one tenth of one percent of requests are allowed to fail. That allowance is your error budget: permission to be imperfect, spent over the measurement window.
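The arithmetic can be sketched in a few lines. This is a minimal illustration, assuming the SLO is measured as a ratio of successful requests over a fixed window; the function names `allowed_failures` and `budget_remaining` are hypothetical and not taken from any particular tool.

```python
def allowed_failures(slo: float, total_requests: int) -> int:
    """Requests permitted to fail in the window under the given SLO."""
    if not 0 < slo < 1:
        raise ValueError("slo must be a fraction strictly between 0 and 1")
    # Rounding absorbs floating-point noise in (1 - slo).
    return round((1 - slo) * total_requests)


def budget_remaining(slo: float, total_requests: int, failed_requests: int) -> float:
    """Fraction of the error budget still unspent; negative when overspent."""
    allowed = allowed_failures(slo, total_requests)
    if allowed == 0:
        raise ValueError("window too small to hold any error budget")
    return 1 - failed_requests / allowed
```

With a 99.9% SLO and one million requests in the window, the sketch allows 1,000 failures. If 250 requests have failed so far, three quarters of the budget remains.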
Why a budget changes behavior
The error budget turns reliability from an argument into accounting. Instead of debating whether a risky launch is acceptable, teams ask whether the budget can pay for it.
- When the budget is healthy, teams can ship faster and take more risk.
- When the budget is exhausted, the service has already spent all the failures the SLO allows, so risky work pauses.
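Asking whether the budget can pay for a launch can be treated as a simple comparison. The sketch below assumes the team can estimate how many failed requests a launch might cost; that estimate, and the name `can_afford`, are hypothetical and only for illustration.

```python
def can_afford(allowed_failures: int, failures_so_far: int,
               estimated_launch_failures: int) -> bool:
    """True if the launch's estimated cost fits in the unspent budget."""
    # Unspent budget, floored at zero once the window is overspent.
    remaining = max(allowed_failures - failures_so_far, 0)
    return estimated_launch_failures <= remaining
```

A launch expected to cost 200 failures fits a budget with 750 failures left, but not one with only 100 left.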
The error budget policy
A policy is the agreed rule for what happens in each budget state. A typical policy freezes feature launches and redirects effort to reliability work once the budget is spent, and unfreezes when it recovers. Writing this down before an incident removes emotion from the decision.
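One way to make such a policy explicit is to write it as a small state machine. This is a sketch under stated assumptions, not a standard implementation: the class name `ErrorBudgetPolicy` is hypothetical, and the `unfreeze_at` threshold (how much budget must return before launches resume) is an assumed parameter that each team would choose for itself.

```python
from dataclasses import dataclass


@dataclass
class ErrorBudgetPolicy:
    # Assumption: launches resume once 10% of the budget is available again.
    unfreeze_at: float = 0.10
    frozen: bool = False

    def update(self, remaining_fraction: float) -> str:
        """Apply the policy to the current fraction of budget remaining."""
        if remaining_fraction <= 0:
            # Budget spent: freeze launches.
            self.frozen = True
        elif self.frozen and remaining_fraction >= self.unfreeze_at:
            # Budget recovered past the agreed threshold: unfreeze.
            self.frozen = False
        if self.frozen:
            return "freeze launches; prioritize reliability work"
        return "launches allowed"
```

Requiring some recovery before unfreezing keeps the policy from flipping back and forth when the budget hovers near zero.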
Key idea
An error budget is the spendable gap between perfect and your SLO, and a written policy decides how that currency governs risk.