Long-Lived Connection Load Balancing

Spreading persistent connections evenly when standard request balancing assumptions no longer hold.

Why normal balancing breaks

A load balancer for HTTP spreads short requests, so any imbalance corrects itself within seconds as those requests finish and new ones are placed. A persistent connection stays on one node for hours, so a single bad placement decision lasts the whole session.

The consequences

A node that gets a burst of connections during a deploy stays hot long after.
Round robin rotates new connections across nodes without regard to how much work each one carries, so a node full of chatty clients overloads while a quiet node idles.
Restarting a node drops all its connections at once, causing a reconnect surge.
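
The restart case can be made concrete with a small simulation. This is a minimal sketch under stated assumptions: three hypothetical nodes named a, b and c, 100 connections each, and clients that reconnect immediately through a round-robin balancer while the restarted node is still down. The function name restart_node is illustrative only.

```python
from itertools import cycle


def restart_node(counts, restarted):
    """Return connection counts after `restarted` goes down and its
    clients reconnect at once, round robin, to the surviving nodes."""
    counts = dict(counts)          # work on a copy, leave the input alone
    dropped = counts[restarted]    # every connection on the node is lost
    counts[restarted] = 0
    survivors = cycle(sorted(n for n in counts if n != restarted))
    for _ in range(dropped):
        counts[next(survivors)] += 1
    return counts


before = {"a": 100, "b": 100, "c": 100}
after = restart_node(before, "c")
print(after)  # {'a': 150, 'b': 150, 'c': 0}
```

Because the connections are long lived, the 150/150/0 split persists after node c comes back: nothing moves the existing sessions, and only new arrivals can land on c.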

Balancing strategies

Balance on active connection count rather than request count.
Drain a node before deploy so its clients reconnect gradually across the fleet.
Have clients reconnect after a randomized delay (jitter) so a node restart does not send every dropped client back at the same instant.
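
The three strategies above can be sketched together. This is an illustration under assumptions, not a reference implementation: the Balancer class, its method names, and the base and cap parameters are all hypothetical, and the jitter formula shown is one common variant (often called full jitter) rather than anything the text prescribes. A real drain would close connections in timed batches; the loop here closes them one at a time to keep the example short.

```python
import random


class Balancer:
    """Places each new connection on the node with the fewest active ones."""

    def __init__(self, nodes):
        self.active = {node: 0 for node in nodes}
        self.draining = set()

    def connect(self):
        # Draining nodes keep their existing sessions but accept no new ones.
        candidates = [n for n in self.active if n not in self.draining]
        if not candidates:
            raise RuntimeError("no node is accepting connections")
        # Ties are broken by node name so placement is deterministic.
        node = min(candidates, key=lambda n: (self.active[n], n))
        self.active[node] += 1
        return node

    def disconnect(self, node):
        self.active[node] -= 1

    def drain(self, node):
        self.draining.add(node)

    def undrain(self, node):
        self.draining.discard(node)


def reconnect_delay(attempt, base=1.0, cap=30.0, rng=random):
    """Client-side delay in seconds: uniform in [0, min(cap, base * 2**attempt)]."""
    return rng.uniform(0.0, min(cap, base * 2 ** attempt))


lb = Balancer(["a", "b", "c"])
for _ in range(9):
    lb.connect()                 # least-connections gives 3 per node

lb.drain("c")                    # before deploying node c
while lb.active["c"]:
    lb.disconnect("c")           # node closes one session...
    lb.connect()                 # ...and that client lands on a or b
after_drain = dict(lb.active)
print(after_drain)               # {'a': 5, 'b': 4, 'c': 0}

lb.undrain("c")                  # deploy finished
placed = lb.connect()            # the emptiest node, c, gets the next client
delay = reconnect_delay(attempt=2)  # somewhere between 0 and 4 seconds
```

Counting active connections is what lets the emptied node fill back up after the deploy: it is the least-loaded candidate, so new arrivals go there first.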

Key idea

Balancing long-lived connections requires counting active connections and draining nodes gracefully, because a single placement decision sticks for the entire session rather than self-correcting as it does with short requests.
