What lag means
In a log-based broker, each message in a partition has an offset. Producers append messages at ever-increasing offsets, while consumers commit the offset up to which they have processed. Consumer lag is the gap between the latest produced offset and the consumer's committed offset, measured per partition. It tells you how many messages are waiting to be processed.
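As a minimal sketch of this definition, assume the latest produced offsets and the committed offsets have already been fetched from the broker into two dictionaries keyed by partition. The names `log_end_offsets` and `committed_offsets` and the numbers are hypothetical, and how you fetch them depends on your broker's client library.

```python
def partition_lag(log_end_offsets, committed_offsets):
    """Return {partition: lag}, where lag = latest produced offset - committed offset."""
    lag = {}
    for partition, end_offset in log_end_offsets.items():
        # Assumption: a partition with no commit yet is treated as committed at 0.
        committed = committed_offsets.get(partition, 0)
        # Clamp at zero because the two snapshots may be taken at slightly different times.
        lag[partition] = max(end_offset - committed, 0)
    return lag


def total_lag(log_end_offsets, committed_offsets):
    """Sum the per-partition lag into one number for the whole topic."""
    return sum(partition_lag(log_end_offsets, committed_offsets).values())


# Illustrative numbers only.
end = {0: 1500, 1: 980, 2: 2010}
committed = {0: 1450, 1: 980, 2: 1700}
print(partition_lag(end, committed))  # {0: 50, 1: 0, 2: 310}
print(total_lag(end, committed))      # 360
```

Keeping the per-partition breakdown is usually worthwhile, since a single stuck partition can hide behind a healthy-looking total.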
Reading the signal
- Lag near zero means consumers keep up with producers.
- Steadily rising lag means consumers are slower than producers, and the backlog will keep growing for as long as that holds.
- A sudden spike often means a stuck or crashed consumer.
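The three patterns above can be told apart from a short series of lag samples. The sketch below is one possible heuristic, not a standard algorithm. The function name `classify_lag` and the thresholds `near_zero` and `spike_factor` are assumptions chosen for illustration.

```python
def classify_lag(samples, near_zero=10, spike_factor=5.0):
    """Classify a chronological series of lag samples (message counts).

    Thresholds are illustrative assumptions, not recommended values.
    """
    if len(samples) < 3:
        return "insufficient data"
    latest, previous = samples[-1], samples[-2]
    # Every sample is small: consumers keep up with producers.
    if max(samples) <= near_zero:
        return "keeping up"
    # Sudden spike: the newest sample is far above the one before it.
    if latest > near_zero and latest >= spike_factor * max(previous, 1):
        return "spike"
    # Steady rise: each sample is higher than the one before it.
    if all(b > a for a, b in zip(samples, samples[1:])):
        return "rising"
    return "fluctuating"


print(classify_lag([0, 3, 1, 2]))          # keeping up
print(classify_lag([100, 180, 260, 350]))  # rising
print(classify_lag([5, 4, 6, 900]))        # spike
```

In practice the sampling interval and window length matter as much as the thresholds, so treat these values as starting points to tune.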
Lag can be measured in messages or estimated in time, that is, the age of the oldest unprocessed message. Time-based lag is often more meaningful for service level objectives, because it maps directly onto a latency target.
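A rough sketch of both ways to get a time-based figure follows. It assumes either that the timestamp of the oldest unprocessed message is available, or, failing that, that producers write at a roughly steady rate so message lag can be converted into an approximate age. The function names and numbers are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def time_lag(oldest_unprocessed_ts, now=None):
    """Age of the oldest unprocessed message; zero when nothing is waiting."""
    if oldest_unprocessed_ts is None:
        return timedelta(0)
    now = now or datetime.now(timezone.utc)
    # Clamp at zero to tolerate small clock differences between hosts.
    return max(now - oldest_unprocessed_ts, timedelta(0))


def estimated_time_lag_s(lag_messages, produce_rate_per_s):
    """Approximate age in seconds of the oldest waiting message.

    Assumption: producers write at a steady rate, so a message that is
    N messages behind the head was produced about N / rate seconds ago.
    """
    if produce_rate_per_s <= 0:
        raise ValueError("produce rate must be positive")
    return lag_messages / produce_rate_per_s


now = datetime(2024, 1, 1, 12, 0, 30, tzinfo=timezone.utc)
oldest = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
print(time_lag(oldest, now).total_seconds())  # 30.0
print(estimated_time_lag_s(600, 20))          # 30.0
```

The timestamp-based figure is the more direct measurement. The rate-based estimate drifts whenever the produce rate is bursty.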
Acting on it
- Alert when lag crosses a threshold tied to your latency budget.
- Add consumers or partitions to raise throughput, remembering that within a consumer group each partition is read by at most one consumer, so consumers beyond the partition count sit idle.
- Investigate poison messages or slow downstream calls when lag climbs despite enough consumers.
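The first two actions can be sketched as simple arithmetic. The example assumes lag is already expressed in seconds and that per-consumer throughput has been measured. The helper names, the 50% alert fraction, and the rates are all hypothetical.

```python
import math


def should_alert(time_lag_s, latency_budget_s, fraction=0.5):
    """Alert once lag has consumed a chosen fraction of the latency budget."""
    return time_lag_s >= fraction * latency_budget_s


def plan_consumers(produce_rate, per_consumer_rate, partitions):
    """Return (consumers_to_run, partitions_needed) to match the produce rate.

    Draining an existing backlog needs extra headroom on top of this.
    """
    if per_consumer_rate <= 0:
        raise ValueError("per-consumer rate must be positive")
    needed = max(math.ceil(produce_rate / per_consumer_rate), 1)
    # Consumers beyond the partition count would sit idle, so cap at partitions.
    consumers_to_run = min(needed, partitions)
    # If more consumers are needed than partitions exist, partitions must grow.
    partitions_needed = max(needed, partitions)
    return consumers_to_run, partitions_needed


print(should_alert(45, 60))          # True
print(should_alert(10, 60))          # False
print(plan_consumers(5000, 800, 6))  # (6, 7)
```

In the last call, seven consumers are needed to match 5000 messages per second, but only six partitions exist, so the plan is to run six consumers now and add a seventh partition. If lag still climbs with enough consumers in place, the cause is more likely a poison message or a slow downstream call than raw capacity.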
Key idea
Consumer lag is the offset gap between produced and processed messages, and rising lag is the early warning to scale or fix consumers.