Concurrency

The Token Bucket Rate Limiter

Limiting request rate while allowing controlled bursts, safely across threads.

A token bucket controls how fast work is allowed to proceed. Picture a bucket that holds tokens up to a capacity. Tokens are added at a steady refill rate. Each request must take one token to proceed; if the bucket is empty, the request waits or is rejected.
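The bucket described above can be sketched in a few lines. This is a minimal single-threaded version, not a reference implementation: the class name `TokenBucket`, the method `try_take`, the choice to start with a full bucket, and the choice to pass timestamps in explicitly are all assumptions made for illustration. It rejects when empty; a waiting variant would sleep until a token is available instead.

```python
class TokenBucket:
    """Single-threaded token bucket; the caller supplies timestamps in seconds."""

    def __init__(self, capacity, refill_rate, now=0.0):
        self.capacity = capacity        # maximum tokens the bucket can hold
        self.refill_rate = refill_rate  # tokens added per second
        self.tokens = float(capacity)   # assumption: the bucket starts full
        self.last = now                 # time of the last refill calculation

    def try_take(self, now):
        # Credit tokens for the time elapsed since the last call,
        # never exceeding capacity.
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True   # request proceeds
        return False      # bucket empty: reject


bucket = TokenBucket(capacity=2, refill_rate=1.0)
results = [bucket.try_take(now=0.0) for _ in range(3)]  # third call finds the bucket empty
later = bucket.try_take(now=1.0)                        # one second later, one token is back
```

Tokens are credited lazily on each call rather than by a background timer, which keeps the sketch free of threads and makes its behavior easy to check by hand.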

Two numbers define the behavior. The refill rate sets the long-term average allowed rate. The capacity sets how large a burst can be absorbed when tokens have accumulated during a quiet period. This lets the limiter smooth traffic while still tolerating short spikes.
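A little arithmetic makes the two numbers concrete. In any window of t seconds, a bucket can admit at most the tokens it already held (no more than the capacity) plus the tokens refilled during the window (rate × t). The function names and the example values below (10 tokens per second, capacity 50) are illustrative assumptions.

```python
def max_admitted(capacity, refill_rate, window_seconds):
    """Upper bound on requests admitted in any window of the given length."""
    return capacity + refill_rate * window_seconds


def tokens_after_idle(tokens, capacity, refill_rate, idle_seconds):
    """Tokens available after a quiet period, capped at capacity."""
    return min(capacity, tokens + refill_rate * idle_seconds)


# Assumed example: 10 tokens per second, capacity 50.
per_minute_bound = max_admitted(50, 10, 60)       # 50 + 10 * 60 = 650
after_3s_idle = tokens_after_idle(0, 50, 10, 3)   # 30 tokens accumulated
after_10s_idle = tokens_after_idle(0, 50, 10, 10) # would be 100, capped at 50
```

Over long windows the capacity term becomes negligible and the bound approaches the refill rate, which is why the rate governs the average and the capacity governs only the burst.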

Under concurrency the bucket is shared, so updates must be safe. Many threads may try to take a token at once, so the take operation must check and decrement as one atomic step. Otherwise two threads could both observe the last remaining token and both spend it.

  • Average control: Refill rate bounds sustained throughput.
  • Burst control: Capacity bounds the size of an instantaneous spike.
  • Atomic take: Concurrent requests need an atomic check and decrement to avoid double spending tokens.
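One common way to make the take atomic is to guard the check and the decrement with a single lock, as in the sketch below. The names `ThreadSafeTokenBucket` and `hammer` are hypothetical, and a mutex is only one option; a lock-free compare-and-swap loop is another. The refill rate is set to zero in the demonstration so that the number of granted tokens is deterministic.

```python
import threading
import time


class ThreadSafeTokenBucket:
    def __init__(self, capacity, refill_rate, clock=time.monotonic):
        self.capacity = capacity
        self.refill_rate = refill_rate   # tokens per second
        self._clock = clock
        self._tokens = float(capacity)   # assumption: the bucket starts full
        self._last = clock()
        self._lock = threading.Lock()

    def try_take(self):
        # Refill, check, and decrement all happen under one lock, so no two
        # threads can observe the same token as available.
        with self._lock:
            now = self._clock()
            elapsed = now - self._last
            self._tokens = min(self.capacity, self._tokens + elapsed * self.refill_rate)
            self._last = now
            if self._tokens >= 1:
                self._tokens -= 1
                return True
            return False


def hammer(bucket, workers, attempts_each):
    """Run several threads against one bucket and count the granted takes."""
    counts = [0] * workers  # each thread writes only its own slot

    def worker(i):
        for _ in range(attempts_each):
            if bucket.try_take():
                counts[i] += 1

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(counts)


# With no refill, 400 attempts against 100 tokens should grant exactly 100.
bucket = ThreadSafeTokenBucket(capacity=100, refill_rate=0.0)
granted = hammer(bucket, workers=8, attempts_each=50)
```

If the check and the decrement were performed without the lock, two threads could both pass the check for the last token and the granted count could exceed the capacity.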

The token bucket is widely used at API gateways and in client libraries because it is simple, allows bursts, and bounds load on whatever sits behind it.

Key idea

A token bucket bounds average rate by refill speed and burst size by capacity, and concurrent takes must check and decrement atomically so that no token is spent twice.

Check yourself

1. What does the token bucket refill rate control?

2. What does the bucket capacity govern?

3. Why must taking a token be atomic under concurrency?