Concurrency

Queue Depth and Latency

Why a deep work queue buys throughput at the cost of ever-growing response time.

A queue is a buffer and a delay

A work queue sits in front of a pool of workers. It smooths bursts so workers stay busy, but every item that waits in the queue adds to that item's latency.
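To make the buffer-and-delay idea concrete, here is a minimal sketch using Python's standard queue and threading modules. The function name run_pool, the pool size, and the 5 ms service time are assumptions chosen for illustration, and time.sleep stands in for real work. Each item carries its enqueue timestamp so a worker can measure how long the item sat in the queue.

```python
import queue
import threading
import time

def run_pool(num_items=20, num_workers=2, service_time=0.005):
    """Push a burst of items through a small worker pool.

    Returns the time each item spent waiting in the queue, in seconds.
    All parameters are illustrative assumptions.
    """
    work = queue.Queue()
    waits = []
    lock = threading.Lock()

    def worker():
        while True:
            enqueued_at = work.get()
            if enqueued_at is None:          # sentinel: shut this worker down
                work.task_done()
                return
            waited = time.monotonic() - enqueued_at
            time.sleep(service_time)         # stand-in for real work
            with lock:
                waits.append(waited)
            work.task_done()

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for _ in range(num_items):               # a burst: everything arrives at once
        work.put(time.monotonic())
    for _ in threads:                        # one sentinel per worker
        work.put(None)
    work.join()
    for t in threads:
        t.join()
    return waits

waits = run_pool()
print(f"items served: {len(waits)}, longest wait: {max(waits) * 1000:.1f} ms")
```

The workers stay busy for the whole burst, which is the smoothing effect, but the items near the back of the queue wait much longer than the ones near the front.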

The depth tradeoff

  • A deep queue absorbs big bursts and keeps throughput high, but items can wait a long time before a worker picks them up.
  • A shallow queue keeps latency low because items are served quickly, but a burst that overflows it gets rejected.

The time an item spends in the queue is roughly the number of items ahead of it divided by the pool's combined service rate (items completed per second across all workers). Double the depth and you roughly double the worst-case wait.
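The rule of thumb above can be worked through with a few lines of arithmetic. This is a rough sketch, not a queueing model: the helper name estimated_wait and the pool figures (4 workers at 25 items per second each) are assumptions made up for the example.

```python
def estimated_wait(position, service_rate):
    """Approximate seconds an item waits: queue position / items served per second."""
    if service_rate <= 0:
        raise ValueError("service_rate must be positive")
    return position / service_rate

# Assumed pool: 4 workers, each finishing 25 items per second.
service_rate = 4 * 25  # 100 items per second in total

for depth in (100, 200, 400):
    wait = estimated_wait(depth, service_rate)
    print(f"depth {depth}: worst-case wait ~ {wait:.1f} s")
# depth 100: worst-case wait ~ 1.0 s
# depth 200: worst-case wait ~ 2.0 s
# depth 400: worst-case wait ~ 4.0 s
```

The last item in a full queue sits at a position equal to the depth, so the worst-case wait scales linearly with depth.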

The hidden cost of unbounded queues

An unbounded queue never rejects work, which sounds friendly, but under sustained overload, where work arrives faster than the workers can finish it, the queue grows without limit. Latency climbs until requests time out anyway, and memory can run out. Bounding the queue lets you fail fast instead of failing slow.
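One way to fail fast is sketched below with Python's standard queue.Queue, which accepts a maxsize and raises queue.Full from put_nowait when the bound is reached. The helper name try_submit and the bound of 3 are assumptions for illustration; a real service would pick the bound from its latency budget and service rate.

```python
import queue

def try_submit(work, item):
    """Enqueue without blocking; return False if the queue is already full."""
    try:
        work.put_nowait(item)
        return True
    except queue.Full:
        return False   # reject now instead of letting the item wait indefinitely

work = queue.Queue(maxsize=3)   # hypothetical bound, chosen for the example

# No worker is draining the queue here, so the fourth and fifth items are rejected.
results = [try_submit(work, n) for n in range(5)]
print(results)        # [True, True, True, False, False]
print(work.qsize())   # 3
```

The caller learns about the rejection immediately and can retry later, shed load, or return an error, rather than discovering the problem through a timeout.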

Key idea

Queue depth trades throughput for latency, and an unbounded queue under overload turns into unbounded latency, so bound it and reject early.

Check yourself

1. What does a deeper queue cost?

2. Why can an unbounded queue be dangerous under sustained overload?