Two layers, two views
A load balancer spreads traffic across backend servers. The layer at which it operates determines what it can inspect and, in turn, how it chooses a backend.
Layer 4 balancing
An L4 balancer works at the transport layer. It sees IP addresses and TCP or UDP ports but not the payload.
- It forwards connections without parsing requests.
- It is fast and cheap because it does little work per packet: no request parsing and no buffering.
- It cannot route on URL, header, or cookie.
Because every packet of a connection goes to the same backend, L4 is well suited to long-lived streams and raw throughput.
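One common way to keep a connection on a single backend is to hash the connection's addresses and ports. The sketch below is illustrative only: `FlowKey`, `pick_backend_l4`, and the backend names are hypothetical, and real L4 balancers typically do this in the kernel or in hardware rather than in Python.

```python
import hashlib
from typing import NamedTuple, Sequence


class FlowKey(NamedTuple):
    """The only fields an L4 balancer is assumed to look at."""
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    protocol: str  # "tcp" or "udp"


def pick_backend_l4(flow: FlowKey, backends: Sequence[str]) -> str:
    """Map a flow to a backend by hashing its addresses and ports.

    The payload is never consulted, so the same flow always maps to the
    same backend for as long as the backend list is unchanged.
    """
    if not backends:
        raise ValueError("no backends available")
    key = "|".join(str(field) for field in flow).encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]


backends = ["10.0.0.1:9000", "10.0.0.2:9000", "10.0.0.3:9000"]
flow = FlowKey("203.0.113.7", 51544, "198.51.100.10", 443, "tcp")
chosen = pick_backend_l4(flow, backends)
```

Note that plain modulo hashing remaps most flows when the backend list changes; that is a limitation of this sketch, not of L4 balancing in general.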
Layer 7 balancing
An L7 balancer terminates the connection and reads the application message, usually HTTP.
- It can route by path, host header, or method.
- It can rewrite headers, cache, compress, and terminate TLS.
- It can retry a failed request on another backend, which is generally safe only for idempotent requests.
This power costs CPU and memory, since the balancer must buffer and parse each request.
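Content-based routing can be sketched as a rule table matched against the parsed request. This is a minimal illustration, not any particular product's configuration: `Request`, `ROUTES`, `pick_pool_l7`, and the host and pool names are all assumptions made up for the example.

```python
from typing import NamedTuple


class Request(NamedTuple):
    """Fields assumed to be available once the HTTP request is parsed."""
    method: str
    host: str
    path: str


# Hypothetical rules: (host to match or "" for any, path prefix, pool name).
# The first matching rule wins.
ROUTES = [
    ("api.example.com", "/", "api-pool"),
    ("", "/static/", "static-pool"),
]


def pick_pool_l7(request: Request, default: str = "web-pool") -> str:
    """Choose a backend pool from the request's host header and path."""
    for host, prefix, pool in ROUTES:
        if host and request.host != host:
            continue
        if request.path.startswith(prefix):
            return pool
    return default


api_pool = pick_pool_l7(Request("GET", "api.example.com", "/v1/users"))
static_pool = pick_pool_l7(Request("GET", "www.example.com", "/static/logo.png"))
fallback_pool = pick_pool_l7(Request("GET", "www.example.com", "/index.html"))
```

None of these decisions are possible at L4, because the host header and path exist only inside the payload that an L4 balancer does not parse.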
Choosing
Use L4 when you need speed and protocol independence. Use L7 when routing decisions depend on request content.
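The rule of thumb above can be written as a small decision helper. This is only a sketch: `choose_layer` and its `parseable` parameter are hypothetical, and the set of protocols an L7 balancer can parse is assumed here to be just HTTP.

```python
def choose_layer(routes_on_content: bool, protocol: str,
                 parseable: frozenset = frozenset({"http"})) -> str:
    """Return "L4" or "L7" following the rule of thumb in the text.

    `parseable` is an assumption: the application protocols the L7
    balancer is able to understand.
    """
    if not routes_on_content:
        # Nothing in the request needs inspecting, so take the cheaper,
        # protocol-independent option.
        return "L4"
    if protocol.lower() not in parseable:
        raise ValueError(
            f"content routing requested, but {protocol!r} cannot be parsed"
        )
    return "L7"


stream_choice = choose_layer(routes_on_content=False, protocol="mqtt")
web_choice = choose_layer(routes_on_content=True, protocol="HTTP")
```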
Key idea
L4 forwards connections blindly and fast; L7 understands requests and routes intelligently at a higher cost.