Timeout Budgets And Cascading

Timeouts are mandatory

A call with no timeout can hang forever, holding a thread or connection. Under load those held resources pile up until the whole service stalls. So every remote call needs a timeout that bounds how long it can wait.
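
As a minimal sketch of what bounding a wait looks like, the Python snippet below puts a timeout on a socket read. The helper names `read_with_timeout` and `demo_timeout` are hypothetical, and the local listener that never replies is an assumption standing in for a hung dependency.

```python
import socket


def read_with_timeout(host: str, port: int, timeout_s: float) -> bytes:
    """Connect and read one reply, waiting at most timeout_s per socket operation."""
    # The timeout given to create_connection stays set on the socket,
    # so it bounds the connect and then the recv, each separately.
    with socket.create_connection((host, port), timeout=timeout_s) as conn:
        return conn.recv(1024)


def demo_timeout() -> str:
    # Hypothetical stand-in for a hung dependency: a local listener that
    # lets connections complete but never sends anything back.
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0 asks the OS for any free port
    server.listen(1)
    host, port = server.getsockname()
    try:
        read_with_timeout(host, port, timeout_s=0.2)
        return "replied"
    except socket.timeout:
        # Without the timeout, recv() would block here indefinitely.
        return "timed out"
    finally:
        server.close()
```

Note that a socket timeout of this kind limits each blocking operation, not the request as a whole, which is one reason a per-call timeout alone does not bound total latency.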

The cascade problem

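Services call services. Suppose a frontend waits up to ten seconds for service A, which in turn waits up to ten seconds for service B. The two timeouts are set independently, so A can still be waiting on B after the frontend has given up, and once retries or several sequential calls are involved the user can wait far longer than any single timeout suggests. Worse, when B is slow, A accumulates waiting threads and starts failing too, so the slowness cascades upward.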

Timeout budgets

The fix is a timeout budget: a single deadline for the whole request that is passed down and decremented at each hop.

The frontend allots, say, three seconds total.
After spending one second reaching A, only two seconds remain for B.
If the remaining budget is at or near zero, fail immediately rather than starting the next call.

This way no downstream call can wait longer than the time the user is actually willing to give.
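
The budget arithmetic above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a reference implementation: the `Deadline` class, the `BudgetExceeded` exception, and the `min_useful_s` threshold (the point below which a call is not worth starting) are all hypothetical names, and the demo uses a fake clock so it runs instantly.

```python
import time


class BudgetExceeded(Exception):
    """Raised when too little of the request budget remains to start a call."""


class Deadline:
    """One deadline for the whole request, shared by every downstream call."""

    def __init__(self, budget_s: float, clock=time.monotonic):
        self._clock = clock
        self._expires_at = clock() + budget_s

    def remaining(self) -> float:
        """Seconds left in the budget, never negative."""
        return max(0.0, self._expires_at - self._clock())

    def timeout_for_next_call(self, min_useful_s: float = 0.05) -> float:
        """Return the timeout to give the next call, or refuse to start it."""
        left = self.remaining()
        if left < min_useful_s:  # assumed threshold for "near zero"
            raise BudgetExceeded(f"only {left:.3f}s left in the budget")
        return left


def demo_budget():
    now = [0.0]  # fake clock, advanced by hand
    deadline = Deadline(3.0, clock=lambda: now[0])  # three seconds total
    timeout_a = deadline.timeout_for_next_call()    # full budget available for A
    now[0] += 1.0                                   # one second spent reaching A
    timeout_b = deadline.timeout_for_next_call()    # two seconds remain for B
    now[0] += 1.98                                  # budget almost exhausted
    try:
        deadline.timeout_for_next_call()
        skipped = False
    except BudgetExceeded:
        skipped = True                              # next call is never started
    return timeout_a, timeout_b, skipped
```

Within one process the `Deadline` object can simply be passed along. Across a service boundary the remaining time would have to travel with the request, for example in a header, which this sketch does not show.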

Pairing with circuit breakers

Timeouts detect slowness per call, while a circuit breaker trips after repeated timeouts to stop hammering a sick dependency entirely.
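
To make the pairing concrete, here is a deliberately small circuit breaker in Python. It is a sketch under assumptions rather than a production design: the names `CircuitBreaker` and `CircuitOpen`, the threshold of three consecutive timeouts, and the thirty-second cool-down are all illustrative choices, and the class is not thread-safe.

```python
import time


class CircuitOpen(Exception):
    """Raised when the breaker is open and the call is skipped."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_after_s=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after_s = reset_after_s
        self._clock = clock
        self._failures = 0       # consecutive timeouts seen so far
        self._opened_at = None   # time the breaker tripped, or None if closed

    def call(self, fn, *args, **kwargs):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_after_s:
                raise CircuitOpen("dependency marked unhealthy; call skipped")
            # Cool-down elapsed: let one trial call through. The failure count
            # is kept, so a single further timeout trips the breaker again.
            self._opened_at = None
        try:
            result = fn(*args, **kwargs)
        except TimeoutError:
            self._failures += 1
            if self._failures >= self.failure_threshold:
                self._opened_at = self._clock()  # trip the breaker
            raise
        self._failures = 0  # any success closes the breaker fully
        return result


def demo_breaker():
    now = [0.0]  # fake clock so the cool-down never elapses in the demo
    breaker = CircuitBreaker(failure_threshold=3, reset_after_s=30.0,
                             clock=lambda: now[0])
    attempts = [0]

    def slow_dependency():
        attempts[0] += 1
        raise TimeoutError("no reply in time")

    outcomes = []
    for _ in range(5):
        try:
            breaker.call(slow_dependency)
        except TimeoutError:
            outcomes.append("timeout")
        except CircuitOpen:
            outcomes.append("skipped")
    return outcomes, attempts[0]
```

In the demo the first three calls each time out, the breaker trips, and the last two calls are rejected without touching the dependency at all.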

Key idea