Taking turns
Round robin is the most basic load-balancing policy. Requests are handed to backends in a fixed cycle: server one, then two, then three, then back to one.
- It needs almost no state, just a counter.
- It assumes every backend is equally capable.
- It assumes every request costs about the same.
When those assumptions hold, round robin spreads load evenly and predictably.
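The cycle described above can be sketched in a few lines. This is a minimal illustration, not any particular balancer's implementation; the class name RoundRobin, the pick method, and the backend names are made up for the example.

```python
class RoundRobin:
    """Hand out backends in a fixed, repeating order."""

    def __init__(self, backends):
        if not backends:
            raise ValueError("need at least one backend")
        self._backends = list(backends)
        self._next = 0  # the only state: index of the next backend to use

    def pick(self):
        backend = self._backends[self._next]
        # Advance the counter and wrap around at the end of the list.
        self._next = (self._next + 1) % len(self._backends)
        return backend


# Hypothetical backend names, used only for illustration.
rr = RoundRobin(["server-1", "server-2", "server-3"])
picks = [rr.pick() for _ in range(7)]
print(picks)
# ['server-1', 'server-2', 'server-3', 'server-1', 'server-2', 'server-3', 'server-1']
```

The counter is the whole algorithm: nothing about the request or the backend's condition influences the choice.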
When servers differ
Real fleets are rarely uniform. Some machines have more cores or memory. Weighted round robin assigns each backend a weight and gives more turns to higher-weighted servers.
- A server with weight three receives three times as many requests as a server with weight one.
- Weights let you mix old and new hardware in one pool.
- You can drain a node by lowering its weight toward zero.
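One simple way to realize the weights, sketched below, is to repeat each backend in the cycle as many times as its weight. This is an assumption made for illustration: real balancers may interleave turns more smoothly rather than sending several requests in a row to the same server. The names WeightedRoundRobin, new-box, old-box, and draining are hypothetical.

```python
from collections import Counter


class WeightedRoundRobin:
    """Round robin over a cycle in which each backend appears `weight` times."""

    def __init__(self, weights):
        # weights: mapping of backend name -> non-negative integer weight
        self._schedule = []
        for backend, weight in weights.items():
            if weight < 0:
                raise ValueError("weights must be non-negative")
            # A weight of zero adds no turns, so that backend gets no requests.
            self._schedule.extend([backend] * weight)
        if not self._schedule:
            raise ValueError("at least one backend needs a positive weight")
        self._next = 0

    def pick(self):
        backend = self._schedule[self._next]
        self._next = (self._next + 1) % len(self._schedule)
        return backend


# Hypothetical pool: one strong machine, one weak one, one being drained.
wrr = WeightedRoundRobin({"new-box": 3, "old-box": 1, "draining": 0})
counts = Counter(wrr.pick() for _ in range(8))
print(counts["new-box"], counts["old-box"], counts["draining"])
# 6 2 0
```

Over two full cycles the weight-three server handles three times the requests of the weight-one server, and the weight-zero server handles none.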
Limits
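Both policies are blind to actual load. A round robin pool can still overload one server if the requests it receives happen to be slow, because the algorithm never looks at response time or open connections. Weights do not fix this: they are set in advance and stay the same no matter how busy a server becomes.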
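A small simulation shows how an even request count can hide uneven work. The cost figures are invented for illustration: every third request is assumed to be ten times as expensive as the others, and the function name assign_round_robin is hypothetical.

```python
def assign_round_robin(costs, n_backends):
    """Assign requests in round robin order; return per-backend counts and total work."""
    counts = [0] * n_backends
    work = [0] * n_backends
    for i, cost in enumerate(costs):
        backend = i % n_backends  # the choice depends only on request order
        counts[backend] += 1
        work[backend] += cost
    return counts, work


# Invented workload: one slow request (cost 10) followed by two fast ones (cost 1).
costs = [10, 1, 1] * 3
counts, work = assign_round_robin(costs, 3)
print(counts)  # [3, 3, 3]
print(work)    # [30, 3, 3]
```

Each backend receives the same number of requests, yet the first one does ten times the work, because the slow requests happen to line up with its turns.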
Key idea
Round robin cycles requests evenly; weighted round robin biases the cycle toward stronger servers, but neither reacts to real-time load.