What a load balancer does
A load balancer sits in front of a pool of identical servers and spreads incoming requests across them. To clients it looks like a single endpoint, while behind it any number of machines can come and go.
The goals are simple:
- Spread load so no single server is overwhelmed.
- Hide failures by routing around unhealthy hosts.
- Scale out by adding servers without changing client code.
Common algorithms
- Round robin sends each new request to the next server in order. It is simple, and it is fair when servers have equal capacity and requests cost about the same.
- Least connections picks the server with the fewest active connections, which helps when requests vary in cost.
- Hashing maps a key such as a user ID to a fixed server, so the same key keeps landing on the same server (sticky routing).
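The three algorithms above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular balancer implements them: the server names, the connection counts, and the function names are all made up for the example.

```python
import hashlib
import itertools

# Hypothetical pool of identical servers.
servers = ["app-1", "app-2", "app-3"]

# Round robin: walk the pool in order and wrap around at the end.
_next_server = itertools.cycle(servers)

def round_robin() -> str:
    return next(_next_server)

# Least connections: a real balancer updates these counts as
# connections open and close; here they are fixed, assumed values.
active = {"app-1": 4, "app-2": 1, "app-3": 7}

def least_connections() -> str:
    return min(servers, key=lambda s: active[s])

# Hashing: the same key always maps to the same server,
# as long as the pool itself does not change.
def hashed(key: str) -> str:
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]
```

One caveat with the simple modulo approach in `hashed`: adding or removing a server changes `len(servers)`, so most keys move to a different server. Consistent hashing is the usual way to reduce that reshuffling.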
Layer 4 vs layer 7
A layer 4 balancer routes on IP address and port without reading the payload, so it adds little overhead. A layer 7 balancer parses HTTP, so it can route by path or header. To read HTTPS traffic it has to terminate TLS itself.
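As a rough sketch of what routing by path or header means at layer 7, the function below picks a backend pool from the request path and headers. The pool names, the path prefixes, and the `X-Canary` header are hypothetical, chosen only for illustration.

```python
# Hypothetical routing table: path prefix -> backend pool name.
ROUTES = [
    ("/api/", "api-pool"),
    ("/static/", "static-pool"),
]
DEFAULT_POOL = "web-pool"

def choose_pool(path: str, headers: dict) -> str:
    # Header rule first: an assumed "X-Canary: 1" header sends the
    # request to a separate pool. Header names are matched exactly
    # here; a real balancer treats them as case-insensitive.
    if headers.get("X-Canary") == "1":
        return "canary-pool"
    # Then the first matching path prefix wins.
    for prefix, pool in ROUTES:
        if path.startswith(prefix):
            return pool
    return DEFAULT_POOL
```

A layer 4 balancer could not make either decision, because both the path and the headers live inside the payload it never reads.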
Health checks let the balancer probe each server at regular intervals and stop sending traffic to any that fail until they pass again.
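A health check can be as simple as trying to open a TCP connection. The sketch below assumes that kind of probe and a pool given as (host, port) pairs. The function names and the one-second timeout are illustrative choices, not taken from any specific balancer.

```python
import socket

def tcp_probe(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, unreachable, or timed out: treat as unhealthy.
        return False

def healthy_servers(pool, probe=tcp_probe):
    """Keep only the (host, port) pairs that pass the probe."""
    return [(host, port) for (host, port) in pool if probe(host, port)]
```

The balancer would run `healthy_servers` on a timer and route only to the result. Production balancers typically wait for several consecutive failures before removing a server, so that one dropped probe does not take a healthy host out of rotation.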
Key idea
A load balancer turns a pool of replaceable servers into one reliable address.