The problem with local counters
If each server keeps its own in-memory counter, a client limited to one hundred requests per minute can actually make one hundred requests per minute on every server. Behind a load balancer with ten servers, the effective limit becomes one thousand. The limit must be global, shared across the fleet.
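The arithmetic can be checked with a small simulation. This is only an illustrative sketch: the `LocalLimiter` class, the `simulate` function, the client name, and the round-robin load balancer are all assumptions made for the example, and window resets are left out.

```python
from collections import defaultdict

LIMIT_PER_MINUTE = 100  # the per-client limit used in the text


class LocalLimiter:
    """Hypothetical per-server limiter with its own in-memory counter."""

    def __init__(self, limit):
        self.limit = limit
        self.counts = defaultdict(int)

    def allow(self, client_id):
        # Each server only sees the requests routed to it.
        if self.counts[client_id] >= self.limit:
            return False
        self.counts[client_id] += 1
        return True


def simulate(num_servers, requests, limit=LIMIT_PER_MINUTE):
    """Send `requests` from one client through a round-robin balancer
    within a single window and count how many are allowed."""
    servers = [LocalLimiter(limit) for _ in range(num_servers)]
    allowed = 0
    for i in range(requests):
        server = servers[i % num_servers]  # assumed round-robin routing
        if server.allow("client-a"):
            allowed += 1
    return allowed


if __name__ == "__main__":
    print(simulate(num_servers=1, requests=5000))   # one server
    print(simulate(num_servers=10, requests=5000))  # ten servers
```

With one server the client gets 100 requests through; with ten servers the same client gets 1000, because no server knows what the others have counted.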
Why Redis
A central Redis instance gives every server a single shared counter keyed by client identity. Each server increments the same key, so the limit is enforced once across all of them. Redis suits this because it is fast, keeps data in memory, and offers atomic commands such as INCR and EXPIRE.
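To show the effect of sharing one counter, here is a minimal sketch that swaps Redis for an in-process stand-in. `SharedStore`, `Server`, `simulate_shared`, and the `ratelimit:` key prefix are hypothetical names chosen for illustration; the lock only imitates the atomicity that Redis INCR provides, and key expiry is omitted.

```python
import threading


class SharedStore:
    """In-process stand-in for Redis. `incr` imitates the atomic
    increment-and-return behavior of the INCR command."""

    def __init__(self):
        self._lock = threading.Lock()
        self._data = {}

    def incr(self, key):
        with self._lock:
            self._data[key] = self._data.get(key, 0) + 1
            return self._data[key]


class Server:
    """Hypothetical app server that keeps no counter of its own."""

    def __init__(self, store, limit):
        self.store = store
        self.limit = limit

    def allow(self, client_id):
        # Every server builds the same key for the same client,
        # so they all increment one shared counter.
        key = f"ratelimit:{client_id}"
        return self.store.incr(key) <= self.limit


def simulate_shared(num_servers, requests, limit=100):
    """Route one client's requests round-robin across servers that
    share a single store, and count how many are allowed."""
    store = SharedStore()
    servers = [Server(store, limit) for _ in range(num_servers)]
    return sum(
        servers[i % num_servers].allow("client-a") for i in range(requests)
    )


if __name__ == "__main__":
    print(simulate_shared(num_servers=10, requests=5000))
```

Here ten servers together allow only 100 requests, since each increment lands on the same key regardless of which server handled the request.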
What must be handled
- Set a time to live on the key so the window resets automatically.
- Use atomic increment so concurrent servers do not lose updates.
- Plan for the extra network hop to Redis, which adds latency to every request.
- Decide what to do if Redis is unreachable: fail open and allow, or fail closed and reject.
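The points above can be combined into one fixed-window check. The following is a sketch under stated assumptions, not a definitive implementation: `FixedWindowLimiter`, `InMemoryStore`, the `ratelimit:` key prefix, and the `store_errors` parameter are hypothetical. The `client` argument is assumed to expose `incr(key)` and `expire(key, seconds)`, as a redis-py client does; with redis-py you would likely pass `redis.exceptions.RedisError` as `store_errors`.

```python
class FixedWindowLimiter:
    """Fixed-window limiter over a Redis-like client (assumed interface:
    incr(key) returns the new count, expire(key, seconds) sets a TTL)."""

    def __init__(self, client, limit=100, window_seconds=60, fail_open=True,
                 store_errors=(ConnectionError, TimeoutError)):
        self.client = client
        self.limit = limit
        self.window_seconds = window_seconds
        self.fail_open = fail_open        # policy when the store is unreachable
        self.store_errors = store_errors  # exception types treated as outages

    def allow(self, client_id):
        key = f"ratelimit:{client_id}"
        try:
            # Atomic increment: concurrent servers never lose an update.
            count = self.client.incr(key)
            if count == 1:
                # First hit in this window: start the TTL so the key
                # disappears and the window resets on its own.
                self.client.expire(key, self.window_seconds)
        except self.store_errors:
            # Store unreachable: allow (fail open) or reject (fail closed).
            return self.fail_open
        return count <= self.limit


class InMemoryStore:
    """Stand-in for local experiments. Records TTLs but does not
    actually expire keys."""

    def __init__(self):
        self.counts = {}
        self.ttls = {}

    def incr(self, key):
        self.counts[key] = self.counts.get(key, 0) + 1
        return self.counts[key]

    def expire(self, key, seconds):
        self.ttls[key] = seconds
        return True


if __name__ == "__main__":
    limiter = FixedWindowLimiter(InMemoryStore(), limit=3)
    print([limiter.allow("client-a") for _ in range(5)])
```

Two caveats on this sketch. The increment and the expire are separate calls, so a crash between them could leave a key with no time to live; running both in a single server-side script or transaction is one way to close that gap. For the latency point, a short client timeout (for example the `socket_timeout` option in redis-py) bounds how long a slow store can delay each request before the fail-open or fail-closed policy applies.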
Key idea
Distributed rate limiting moves the counter into a shared store like Redis so a whole fleet enforces one global limit.