Networking

Connection Draining

How a backend leaves the pool without dropping in-flight requests.

Leaving gracefully

When a backend is being shut down, replaced during a deploy, or scaled in, you do not want to kill it the instant it stops being needed, because any in-flight requests would fail. Connection draining, also called graceful shutdown, lets a backend finish its current work before it disappears.

How draining works

The load balancer marks the backend as draining:

  • It stops sending new requests to that instance.
  • It lets existing requests run to completion, up to a timeout.
  • Once active requests finish or the timeout expires, the instance is fully removed.

Why the timeout matters

Draining needs a bound. A backend with a stuck, long-lived request should not block a deploy forever, so there is a maximum drain time after which any remaining connections are closed. Setting it too short cuts off legitimate requests; setting it too long slows rollouts. Teams tune it to the typical request duration plus a margin.
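One way to turn "typical request duration plus a margin" into a number is sketched below. The function name suggest_drain_timeout, the choice of a high percentile as the "typical" duration, and the default margin and cap are all assumptions made for illustration; the lesson does not prescribe specific values.

```python
import math


def suggest_drain_timeout(durations_s, percentile=0.99, margin_s=5.0, cap_s=300.0):
    """Suggest a drain timeout in seconds from observed request durations.

    Assumption: "typical" is taken to be a high percentile of observed
    durations (nearest-rank method). The cap keeps one stuck request
    from inflating the timeout and slowing rollouts.
    """
    if not durations_s:
        raise ValueError("need at least one observed duration")
    ordered = sorted(durations_s)
    rank = max(1, math.ceil(percentile * len(ordered)))
    typical = ordered[rank - 1]
    return min(typical + margin_s, cap_s)


# Hypothetical sample of request durations, in seconds.
observed = [0.2, 0.3, 0.5, 1.0, 12.0]
timeout = suggest_drain_timeout(observed)  # 12.0 + 5.0 = 17.0
```

The cap reflects the trade-off described above: without it, a single very slow request in the sample would push the timeout up and slow every rollout.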

This pattern is what makes rolling deploys and autoscaling safe: instances come and go without users seeing reset connections.

Key idea

Connection draining stops new traffic to a leaving backend while letting in-flight requests finish, bounded by a timeout so that rollouts stay safe and predictable.

Check yourself

1. What does connection draining do when a backend is removed?

2. Why does draining have a maximum timeout?