Knowing who is alive
A balancer must send traffic only to backends that can serve it. Health checking continuously monitors each backend, removes failing ones from the rotation, and returns them once they recover.
Active versus passive
- Active checks send probes on a schedule, such as an HTTP request to a health endpoint or a TCP connect. The balancer decides health from the response.
- Passive checks observe real traffic: a burst of errors or timeouts marks a backend unhealthy without a dedicated probe.
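Many systems combine both: passive checks react quickly to failures seen in live traffic at no extra probing cost, while active checks cover idle backends and notice when an evicted backend has recovered.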
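A minimal sketch of the two styles, in Python. The names tcp_probe and PassiveMonitor, the window size, and the error limit are assumptions made for illustration, not part of any particular balancer.

```python
import socket
from collections import deque


def tcp_probe(host: str, port: int, timeout: float = 1.0) -> bool:
    """Active check: healthy if a TCP connection opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, unreachable, or timed out
        return False


class PassiveMonitor:
    """Passive check: judge health from the outcomes of recent real requests."""

    def __init__(self, window: int = 20, max_errors: int = 5):
        # Hypothetical policy: unhealthy once max_errors of the last
        # `window` requests have failed.
        self.outcomes = deque(maxlen=window)  # True means the request succeeded
        self.max_errors = max_errors

    def record(self, ok: bool) -> None:
        self.outcomes.append(ok)

    def healthy(self) -> bool:
        errors = sum(1 for ok in self.outcomes if not ok)
        return errors < self.max_errors
```

The active probe costs one connection per interval per backend. The passive monitor costs nothing extra, but it learns nothing about a backend that receives no traffic.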
Thresholds and flapping
A single failed probe is a noisy signal, so balancers act on thresholds rather than individual results:
- Unhealthy threshold: how many consecutive failures before eviction.
- Healthy threshold: how many consecutive successes before the backend returns to the rotation.
- Interval and timeout: how often to probe and how long to wait.
These hysteresis rules prevent flapping, where a borderline backend rapidly toggles in and out.
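The threshold rules can be sketched as a small state machine. This is an illustrative Python version; the class name HealthState and the default thresholds of 3 and 2 are assumptions, and the interval and timeout are left to whatever loop calls observe.

```python
from dataclasses import dataclass


@dataclass
class HealthState:
    unhealthy_threshold: int = 3  # consecutive failures before eviction
    healthy_threshold: int = 2    # consecutive successes before return
    healthy: bool = True
    streak: int = 0  # consecutive results that disagree with the current state

    def observe(self, probe_ok: bool) -> bool:
        """Feed in one probe result and return the resulting health state."""
        if probe_ok == self.healthy:
            self.streak = 0  # result agrees with the current state
            return self.healthy
        self.streak += 1
        limit = self.healthy_threshold if probe_ok else self.unhealthy_threshold
        if self.streak >= limit:
            self.healthy = probe_ok  # enough consecutive evidence: flip state
            self.streak = 0
        return self.healthy


state = HealthState()
probes = [False, False, True, False, False, False, True, True]
print([state.observe(ok) for ok in probes])
# [True, True, True, True, True, False, False, True]
```

The two early failures are absorbed because a success resets the streak. Only the run of three failures evicts the backend, and it returns after two consecutive successes.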
Shallow versus deep
A shallow check confirms only that the process answers. A deep check also verifies that dependencies such as the database are reachable, catching backends that are up but unable to serve. Deep checks are more honest but can cascade failures: if a shared dependency blips, every backend fails its check at once and the balancer may evict them all.
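The difference can be sketched in Python as follows. The function names and the dependency map are hypothetical; a real service would plug in its own probes, for example a cheap database query.

```python
from typing import Callable, Dict


def shallow_check() -> bool:
    """Shallow: if this code runs at all, the process is answering."""
    return True


def deep_check(dependencies: Dict[str, Callable[[], bool]]) -> Dict[str, bool]:
    """Deep: run every dependency probe and report each result by name."""
    results = {}
    for name, probe in dependencies.items():
        try:
            results[name] = bool(probe())
        except Exception:
            results[name] = False  # a probe that raises counts as unreachable
    return results


def is_ready(dependencies: Dict[str, Callable[[], bool]]) -> bool:
    """Ready only if the process answers and every dependency is reachable."""
    return shallow_check() and all(deep_check(dependencies).values())
```

Reporting each dependency by name lets an operator see that every backend failed on the same shared dependency, which is the signature of the cascade described above.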
Key idea
Health checks, active or passive, with thresholds to prevent flapping, keep traffic flowing only to backends that can actually serve it.