System Design

The Token Bucket Algorithm

Refill tokens at a steady rate and spend one per request, allowing controlled bursts.

How it works

A token bucket holds up to a fixed capacity of tokens. Tokens are added at a steady refill rate, for example ten per second, but the bucket never overflows past its capacity. Each request must take one token to proceed. If the bucket is empty, the request is rejected or made to wait.
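As a rough illustration of these mechanics, here is a minimal Python sketch. The names `TokenBucket` and `try_acquire`, the injectable `clock` parameter, and the choice to start with a full bucket are assumptions made for this example, not part of any particular library. It refills lazily, working out the tokens owed from the elapsed time on each call instead of running a background timer, and it rejects rather than waits when the bucket is empty.

```python
import time


class TokenBucket:
    """Hypothetical minimal token bucket with lazy refill."""

    def __init__(self, capacity, refill_rate, clock=time.monotonic):
        self.capacity = capacity        # maximum tokens the bucket can hold
        self.refill_rate = refill_rate  # tokens added per second
        self.clock = clock              # injectable so the sketch can be tested
        self.tokens = float(capacity)   # assumption: the bucket starts full
        self.last = clock()

    def _refill(self):
        # Add the tokens earned since the last call, never exceeding capacity.
        now = self.clock()
        elapsed = now - self.last
        self.last = now
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)

    def try_acquire(self):
        """Spend one token if available; return False (reject) when empty."""
        self._refill()
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

A caller would create one bucket per client, for example `TokenBucket(capacity=20, refill_rate=10)`, and call `try_acquire()` before handling each request. A waiting variant would sleep until the next token is due instead of returning `False`.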

Why it allows bursts

Because tokens accumulate up to the capacity while traffic is idle, a client can spend the saved-up tokens all at once. This permits a burst of up to the bucket's capacity, after which throughput settles back to the steady refill rate. The long-run average rate is bounded by the refill rate, while short spikes up to the capacity are allowed.
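The burst behavior can be seen in a small simulation. This is a sketch under stated assumptions: the function `admitted` and the traffic pattern are made up for illustration, the bucket starts full (as it would after an idle period), and time is passed in as explicit timestamps rather than read from a clock.

```python
def admitted(capacity, refill_rate, arrival_times):
    """Return one bool per arrival: True if a token was available.

    Hypothetical helper; arrival_times must be in non-decreasing order (seconds).
    """
    tokens = float(capacity)  # assumption: bucket is full after idling
    last = 0.0
    results = []
    for t in arrival_times:
        # Lazy refill for the time since the previous arrival, capped at capacity.
        tokens = min(capacity, tokens + (t - last) * refill_rate)
        last = t
        if tokens >= 1:
            tokens -= 1
            results.append(True)
        else:
            results.append(False)
    return results


# Eight requests arrive at once against a bucket of capacity 5, refill 1 token/s.
burst = admitted(5, 1, [0.0] * 8)
print(burst)   # [True, True, True, True, True, False, False, False]

# Afterwards, one request per second matches the refill rate and is admitted.
steady = admitted(5, 1, [0.0] * 8 + [1.0, 2.0, 3.0])[8:]
print(steady)  # [True, True, True]
```

The first five requests drain the saved-up tokens, the next three are rejected, and from then on the client is held to the refill rate.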

Tuning the two knobs

  • Refill rate sets the sustained throughput.
  • Capacity sets how large a burst you tolerate.

This separation is a large part of why token bucket is a common choice for public APIs: it caps the long-run rate while still feeling responsive to bursty clients.
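To see how the two knobs act independently, it can help to compute the most requests a bucket could admit in a window of a given length: at most a full bucket at the start plus whatever is refilled during the window. The helper `max_admitted` and the numbers below are illustrative assumptions, not recommended settings.

```python
import math


def max_admitted(capacity, refill_rate, window_seconds):
    """Upper bound on requests admitted in any window of this length.

    Hypothetical helper: a full bucket at the start of the window plus
    the tokens refilled while the window is open.
    """
    return math.floor(capacity + refill_rate * window_seconds)


# Same refill rate (10/s), different capacities: only the burst allowance changes.
small_burst = max_admitted(capacity=20, refill_rate=10, window_seconds=60)
large_burst = max_admitted(capacity=100, refill_rate=10, window_seconds=60)
print(small_burst, large_burst)  # 620 700
```

Over a long window the refill term dominates, so capacity mostly affects how spiky the traffic may be, not how much of it gets through in total.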

Key idea

A token bucket bounds the average rate by its refill while permitting bursts up to its capacity.

Check yourself

1. What lets a token bucket permit bursts?

2. Which knob sets the sustained throughput?

3. What happens when the bucket is empty?