Graceful Connection Draining

The problem with hard shutdowns

When a server is taken out of rotation for a deploy or scale-down, simply killing it drops every in-flight request and fails any new requests still being routed to it. Graceful connection draining removes the server cleanly: existing work finishes while new work goes elsewhere.

The draining sequence

1. The instance is marked unhealthy or out of service so the load balancer stops sending new requests to it.
2. In-flight requests are allowed to complete, up to a bounded drain timeout.
3. Once connections close or the timeout elapses, the instance shuts down.
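
The sequence above can be sketched in a few lines of Python. This is a minimal illustration, not a production implementation: the Drainer class and its method names are hypothetical, and it assumes the server calls try_begin and end around each request and exposes healthy through its health check endpoint. Refusing new requests outright once draining starts is a simplification; a real server might keep serving stragglers until the load balancer notices the failed health check.

```python
import threading


class Drainer:
    """Tracks in-flight requests and coordinates a bounded drain (hypothetical helper)."""

    def __init__(self):
        self._cond = threading.Condition()
        self._in_flight = 0
        self._draining = False

    def healthy(self):
        # Health check: report unhealthy once draining begins so the
        # load balancer stops routing new requests to this instance.
        with self._cond:
            return not self._draining

    def try_begin(self):
        # Admit a request unless a drain has started.
        with self._cond:
            if self._draining:
                return False
            self._in_flight += 1
            return True

    def end(self):
        # Mark one request as finished and wake the drainer if idle.
        with self._cond:
            self._in_flight -= 1
            if self._in_flight == 0:
                self._cond.notify_all()

    def drain(self, timeout):
        # Stop admitting work, then wait for in-flight requests to finish.
        # Returns True if everything completed, False if the timeout elapsed.
        with self._cond:
            self._draining = True
            return self._cond.wait_for(
                lambda: self._in_flight == 0, timeout=timeout
            )
```

A shutdown hook would call drain with the chosen timeout and then exit regardless of the result, logging a False return so that requests cut off by the timeout are visible.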

A good drain also sends a connection-close signal (in HTTP/1.1, a Connection: close response header; in HTTP/2, a GOAWAY frame) so keep-alive clients reconnect elsewhere rather than reuse a connection that is about to vanish. The drain timeout bounds how long a deploy waits, because a stuck request should not block shutdown forever; requests still running when it expires are cut off. Draining is what lets rolling deployments and autoscaling proceed without users seeing errors, making it a quiet but essential part of zero-downtime operations.
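
For HTTP/1.1, the connection close signal can be illustrated with Python's standard http.server module. This is a sketch under assumptions: the draining flag is a hypothetical module-level event that the shutdown logic would set, and the handler serves a fixed body. While the flag is clear, responses leave the keep-alive connection open; once it is set, each response carries Connection: close and the server closes the socket after sending it.

```python
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Hypothetical flag; the shutdown logic sets it when the drain begins.
draining = threading.Event()


class Handler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # keep-alive is the default in HTTP/1.1

    def do_GET(self):
        body = b"ok\n"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        if draining.is_set():
            # Tell the client not to reuse this connection. http.server
            # also closes the socket once this response has been sent.
            self.send_header("Connection", "close")
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def make_server(port=0):
    # Port 0 asks the OS for any free port; read it from server_address.
    return ThreadingHTTPServer(("127.0.0.1", port), Handler)
```

A client that sees the header opens a fresh connection for its next request, which the load balancer can then route to a healthy instance.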

Key idea

Graceful draining stops new traffic to a server, lets in-flight requests finish within a timeout, then shuts the server down, enabling zero-downtime deploys and scaling.
