System Design

The Distributed Rate Limiter

Enforcing a global request limit across many servers sharing one budget.

5 min read · advanced

The shared budget problem

Each server can rate limit its own traffic easily. But a global limit, like one thousand requests per second per customer, spans many servers that must share a single budget without exceeding it.

Approaches

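  • Centralized counter: every server checks a shared store like Redis. Accurate, but every request pays an extra network round trip and a busy customer's counter becomes a hot key.
  • Token bucket per node: split the global budget across nodes. Fast, but wastes budget when traffic is uneven: a busy node rejects requests while quieter nodes leave their share unused.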
  • Sliding window with sharing: nodes periodically report usage and adjust their local allowance.
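
To make the first two approaches concrete, here is a minimal sketch that compares them on one second of uneven traffic. It is an illustration under stated assumptions, not a production design: `WindowCounter` is a hypothetical fixed-window counter used in place of a real token bucket, the "central" counter is an in-memory stand-in for a shared store, and the traffic figures in `OFFERED` are invented.

```python
class WindowCounter:
    """Fixed one-second window counter (hypothetical stand-in for a real limiter)."""

    def __init__(self, limit: int) -> None:
        self.limit = limit
        self.window = -1
        self.used = 0

    def allow(self, now: float) -> bool:
        window = int(now)
        if window != self.window:  # a new second has started: reset the count
            self.window, self.used = window, 0
        if self.used >= self.limit:
            return False
        self.used += 1
        return True


GLOBAL_LIMIT = 1000
OFFERED = [700, 100, 100, 100]  # assumed requests reaching each of 4 nodes in one second

# Centralized: every node consults the same counter.
# In a real deployment each call would be a network round trip to the shared store.
central = WindowCounter(GLOBAL_LIMIT)
central_accepted = sum(central.allow(0.0) for _ in range(sum(OFFERED)))

# Static split: each node enforces an equal share and never coordinates.
share = GLOBAL_LIMIT // len(OFFERED)
nodes = [WindowCounter(share) for _ in OFFERED]
split_accepted = [
    sum(node.allow(0.0) for _ in range(count))
    for node, count in zip(nodes, OFFERED)
]

print(central_accepted)     # 1000
print(split_accepted)       # [250, 100, 100, 100]
print(sum(split_accepted))  # 550
```

With these numbers the central counter admits all 1000 requests, while the static split admits only 550: the busy node is capped at its share of 250 and the other three nodes leave 150 each unused.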

The trade

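The core tension is accuracy versus latency. A central check is precise, but every request pays a network round trip and the shared store becomes a bottleneck. Local buckets answer without a network call, but their combined admissions can drift above or below the true global limit, because each node decides on a partial or stale view of total usage. Many systems therefore enforce locally and reconcile periodically as a practical balance.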
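
One way to picture local enforcement with periodic reconciliation is the sketch below. It rests on several assumptions that are not from the lesson: a hypothetical `rebalance` function run once per interval, nodes that report demand (including rejected requests) rather than only accepted usage, a small `floor` so no node is starved, and the same invented traffic on every interval.

```python
def rebalance(global_limit: int, demand: list[int], floor: int = 10) -> list[int]:
    """Split global_limit across nodes in proportion to last-interval demand.

    Every node keeps at least `floor` so it can still admit traffic if its
    load picks up before the next reconciliation. Integer division rounds
    down, so the allowances never sum to more than global_limit.
    """
    n = len(demand)
    total = sum(demand)
    if total == 0:
        return [global_limit // n] * n  # no signal yet: fall back to an even split
    spare = global_limit - floor * n
    return [floor + spare * d // total for d in demand]


GLOBAL_LIMIT = 1000
OFFERED = [700, 100, 100, 100]  # assumed per-node demand, identical each interval

allowance = [GLOBAL_LIMIT // len(OFFERED)] * len(OFFERED)  # start with an even split
history = []
for _ in range(2):
    # Each node enforces its own allowance locally, with no network call.
    accepted = [min(a, d) for a, d in zip(allowance, OFFERED)]
    history.append((allowance, accepted))
    # Reconciliation step: nodes report demand and receive new allowances.
    allowance = rebalance(GLOBAL_LIMIT, OFFERED)

for interval, (allowed, accepted) in enumerate(history, start=1):
    print(interval, allowed, accepted, sum(accepted))
```

In the first interval the even split admits 550 requests; after one reconciliation the allowances become 682, 106, 106 and 106, and 982 requests are admitted. This particular variant stays at or below the global limit but always reacts one interval late, which is the accuracy cost of skipping the central check.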

Key idea

A distributed rate limiter shares one budget across servers, trading central accuracy against local speed, often using local buckets reconciled periodically.

Check yourself

1. What is the central trade in distributed rate limiting?

2. Why can splitting the global budget across nodes waste capacity?