System Design

Long-Lived Connection Load Balancing

Spreading persistent connections evenly when standard request balancing assumptions no longer hold.

Why normal balancing breaks

An HTTP load balancer typically spreads many short requests, so an imbalance corrects itself within seconds as requests finish and new ones are placed elsewhere. A persistent connection (a WebSocket, for example) can stay on one node for hours, so a single bad placement decision lasts the whole session.

The consequences

  • A node that gets a burst of connections during a deploy stays hot long after.
  • Round robin hands out connections in turn without regard to the work each one generates, so a node full of chatty clients overloads while a quiet node idles.
  • Restarting a node drops all its connections at once, causing a reconnect surge.
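
The chatty-client problem can be made concrete with a small sketch. The client message rates and node names below are invented for illustration, and `round_robin_load` is a hypothetical helper, not part of any real balancer. It only shows that equal connection counts can hide very unequal work.

```python
from itertools import cycle


def round_robin_load(rates, nodes):
    """Assign each client to the next node in turn.

    Returns (connections per node, total message rate per node).
    """
    conns = {n: 0 for n in nodes}
    load = {n: 0 for n in nodes}
    turn = cycle(nodes)
    for rate in rates:
        node = next(turn)
        conns[node] += 1
        load[node] += rate
    return conns, load


# Hypothetical messages per second for six clients: two chatty, four quiet.
rates = [50, 1, 1, 50, 1, 1]
conns, load = round_robin_load(rates, ["a", "b", "c"])
print(conns)  # {'a': 2, 'b': 2, 'c': 2}
print(load)   # {'a': 100, 'b': 2, 'c': 2}
```

Every node holds two connections, yet node "a" carries fifty times the traffic of the others, and with persistent connections that skew lasts until those clients disconnect.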

Balancing strategies

  • Balance on active connection count rather than request count.
  • Drain a node before deploy so its clients reconnect gradually across the fleet.
  • Have clients reconnect with jitter (a randomized delay) so that a node restart does not send all of its dropped clients back at the same instant.
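
The three strategies above might be sketched roughly as follows. This is a toy model under stated assumptions, not a production balancer: the `Balancer` class, its method names, and the `reconnect_delay` defaults (1 second base, 30 second cap) are all hypothetical. The jitter shown is the common "full jitter" variant of exponential backoff, chosen here as one reasonable option.

```python
import random


class Balancer:
    """Toy least-connections balancer with drain support (illustrative only)."""

    def __init__(self, nodes):
        self.active = {n: 0 for n in nodes}  # open connections per node
        self.draining = set()                # nodes that accept no new connections

    def pick(self):
        """Place a new connection on the non-draining node with the fewest open connections."""
        candidates = [n for n in self.active if n not in self.draining]
        if not candidates:
            raise RuntimeError("no node is accepting connections")
        node = min(candidates, key=lambda n: self.active[n])
        self.active[node] += 1
        return node

    def close(self, node):
        """Record that one connection on `node` has ended."""
        self.active[node] -= 1

    def drain(self, node):
        """Stop routing new connections to `node`; existing ones stay open until closed."""
        self.draining.add(node)


def reconnect_delay(attempt, base=1.0, cap=30.0, rng=random):
    """Full-jitter backoff: a uniform random delay in [0, min(cap, base * 2**attempt)] seconds."""
    return rng.uniform(0, min(cap, base * 2 ** attempt))


lb = Balancer(["a", "b", "c"])
lb.active.update({"a": 100, "b": 100, "c": 0})  # "c" has just restarted and is empty
placed = [lb.pick() for _ in range(60)]          # new connections go to the emptiest node
print(lb.active)  # {'a': 100, 'b': 100, 'c': 60}

lb.drain("a")                                    # prepare "a" for a deploy
for _ in range(100):                             # in practice, closed in small batches over time
    lb.close("a")
    lb.pick()                                    # the client reconnects to a non-draining node
print(lb.active)  # {'a': 0, 'b': 130, 'c': 130}
```

A client would sleep for `reconnect_delay(attempt)` before each reconnect attempt, so clients dropped at the same moment come back spread over a window instead of all at once. Note that least-connections still counts connections rather than work, so it addresses the burst and restart cases but not chatty clients on its own.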

Key idea

Balancing long-lived connections requires counting active connections and draining nodes gracefully, because a single placement decision sticks for the entire session rather than self-correcting as it does with short requests.

Check yourself

1. Why is balancing persistent connections harder than balancing HTTP requests?

2. What should you do to a node before deploying to avoid a reconnect surge?