Connection Draining, Deep

Retiring a backend gracefully by letting in-flight requests finish before removal.

Leaving without slamming the door

When you deploy, scale down, or replace a backend, abruptly cutting it off kills in-flight requests and drops user sessions. Connection draining, also called graceful removal, stops sending new traffic to a backend while letting existing requests complete.

How it works

The backend is marked draining in the balancer.
New connections and requests go to other backends.
Existing connections finish naturally, up to a drain timeout.
After the timeout, any connections still open are forcibly closed and the backend is removed.
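
The four steps above can be sketched as a small in-memory model. This is an illustration only, not any particular balancer's API: the names Backend, Balancer, pick, drain, and force_close are hypothetical, and round-robin selection is an assumption made for the sketch.

```python
from dataclasses import dataclass


@dataclass(eq=False)  # compare backends by identity, not by field values
class Backend:
    name: str
    draining: bool = False
    active: int = 0  # number of in-flight requests


class Balancer:
    """Hypothetical round-robin balancer that skips draining backends."""

    def __init__(self, backends):
        self.backends = list(backends)
        self._next = 0

    def pick(self):
        """Route a new request to the next backend that is not draining."""
        n = len(self.backends)
        for _ in range(n):
            backend = self.backends[self._next % n]
            self._next += 1
            if not backend.draining:
                backend.active += 1
                return backend
        raise RuntimeError("no backend available for new traffic")

    def finish(self, backend):
        """An in-flight request on this backend completed."""
        backend.active -= 1

    def drain(self, backend):
        """Step 1: mark the backend as draining; pick() will now skip it."""
        backend.draining = True

    def force_close(self, backend):
        """Step 4: after the timeout, drop stragglers and remove the backend.

        Returns how many in-flight requests were cut off.
        """
        dropped = backend.active
        backend.active = 0
        self.backends.remove(backend)
        return dropped
```

In this model a request that started on a backend before drain() was called still counts in its active total and can call finish() normally; only pick() changes behavior.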

Why the timeout matters

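A drain timeout bounds how long you wait for in-flight work. Set it too short and long-running requests get cut off; set it too long and a deploy crawls, because each backend being replaced can hold up the rollout for the full timeout.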

A short timeout suits APIs whose requests complete in milliseconds.
A longer timeout suits uploads, streaming, or batch calls.
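
One way to picture the trade-off is a polling loop that stops either when nothing is in flight or when the timeout expires. This is a minimal sketch under assumptions: wait_for_drain and its parameters are hypothetical names, and active_count stands in for whatever your balancer or server exposes to report in-flight requests.

```python
import time


def wait_for_drain(active_count, timeout_s, poll_s=0.05,
                   clock=time.monotonic, sleep=time.sleep):
    """Wait until no requests are in flight or the drain timeout expires.

    active_count: callable returning the current number of in-flight requests.
    Returns the number still in flight when waiting stopped. Zero means a
    clean drain; anything else is what would be force closed.
    """
    deadline = clock() + timeout_s
    while True:
        remaining = active_count()
        if remaining == 0:
            return 0
        if clock() >= deadline:
            return remaining
        sleep(poll_s)
```

With a request that needs three seconds to finish, a two-second timeout returns a non-zero straggler count, while a ten-second timeout returns zero as soon as the request completes rather than waiting out the full window.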

Coordinating with health

Draining pairs with health checks and deploy orchestration. A common pattern is to mark a node unhealthy or send it a shutdown signal, wait for the drain window, then terminate it. This enables zero-downtime rollouts because users on the old node finish cleanly while new traffic flows elsewhere.
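
Seen from the backend's side, the pattern can be sketched as follows. This is a hedged illustration, not a real framework's API: DrainableService and its methods are hypothetical, and the 200/503 health responses assume an HTTP health check that the balancer polls.

```python
import threading


class DrainableService:
    """Hypothetical service that fails health checks while it drains."""

    def __init__(self):
        self._idle = threading.Condition(threading.Lock())
        self._in_flight = 0
        self.draining = False

    def health_status(self):
        """Status a health endpoint would return; 503 asks the balancer to stop routing here."""
        return 503 if self.draining else 200

    def request_started(self):
        with self._idle:
            self._in_flight += 1

    def request_finished(self):
        with self._idle:
            self._in_flight -= 1
            if self._in_flight == 0:
                self._idle.notify_all()

    def shutdown(self, drain_timeout_s):
        """Start failing health checks, then wait for the drain window.

        Returns True if all in-flight requests finished in time, False if
        the timeout expired and the caller would terminate with stragglers.
        """
        with self._idle:
            self.draining = True
            return self._idle.wait_for(
                lambda: self._in_flight == 0, timeout=drain_timeout_s
            )
```

An orchestrator hook or a shutdown-signal handler would call shutdown() and terminate the process once it returns. The drain window should also cover the time the balancer needs to notice the failing health check, since requests can still arrive until it does.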

Key idea