Adaptive Rate Limiting

Adjust the limit automatically from live health signals instead of relying on a fixed, hand-tuned number.

Beyond a static number

A fixed limit is a guess. Set it too low and you waste capacity in quiet times; set it too high and you melt the system under load. Adaptive rate limiting adjusts the allowed rate in real time based on how the system is actually doing.

Signals it watches

  • Latency, especially tail percentiles, rising as the system strains.
  • Error rate, including timeouts and rejections from downstream.
  • Queue depth or resource saturation, such as CPU utilization.

When these signals show stress, the limiter tightens the allowed rate; when they look healthy, it loosens it. The resulting limit tracks the true serving capacity, which itself changes with deployments, traffic mix, and hardware.
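As a rough illustration of how the signals might be combined into one tighten-or-loosen decision, here is a minimal Python sketch. The HealthSnapshot type, the is_stressed and decide functions, and all three thresholds are hypothetical names and values chosen for the example, not part of any particular library; real thresholds depend on the service.

```python
from dataclasses import dataclass


@dataclass
class HealthSnapshot:
    p99_latency_ms: float  # tail latency over the last measurement window
    error_rate: float      # fraction of requests that failed or timed out (0.0 to 1.0)
    queue_depth: int       # requests currently waiting to be served


# Hypothetical thresholds, assumed only for this example.
MAX_P99_LATENCY_MS = 250.0
MAX_ERROR_RATE = 0.02
MAX_QUEUE_DEPTH = 100


def is_stressed(h: HealthSnapshot) -> bool:
    """Return True if any single signal crosses its threshold."""
    return (
        h.p99_latency_ms > MAX_P99_LATENCY_MS
        or h.error_rate > MAX_ERROR_RATE
        or h.queue_depth > MAX_QUEUE_DEPTH
    )


def decide(h: HealthSnapshot) -> str:
    """Map a health snapshot to the direction the limit should move."""
    return "tighten" if is_stressed(h) else "loosen"
```

Treating any one bad signal as stress is a deliberately conservative choice: the limiter reacts even if only latency, or only errors, has degraded.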

A control loop view

Adaptive limiting is a feedback control loop, conceptually like additive-increase, multiplicative-decrease (AIMD): probe upward slowly while healthy, cut sharply at the first sign of overload. The aim is to sit just below the point where latency spikes.
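The loop can be sketched in a few lines of Python. This is one possible shape, not a reference implementation: the AimdLimiter class, its parameter names, and the default values (a step of 5, a backoff factor of 0.5, a floor and a ceiling) are assumptions made for illustration.

```python
class AimdLimiter:
    """Additive increase, multiplicative decrease on an allowed request rate."""

    def __init__(self, limit=100.0, step=5.0, backoff=0.5,
                 floor=10.0, ceiling=1000.0):
        self.limit = limit      # current allowed rate, e.g. requests per second
        self.step = step        # amount added per healthy interval
        self.backoff = backoff  # factor applied on overload (0 < backoff < 1)
        self.floor = floor      # never drop below this rate
        self.ceiling = ceiling  # never rise above this rate

    def update(self, overloaded: bool) -> float:
        """Call once per control interval with the latest health verdict."""
        if overloaded:
            # Cut sharply: multiply the limit down.
            self.limit = max(self.floor, self.limit * self.backoff)
        else:
            # Probe slowly: add a small fixed step.
            self.limit = min(self.ceiling, self.limit + self.step)
        return self.limit


limiter = AimdLimiter(limit=100.0)
limiter.update(overloaded=False)  # 105.0
limiter.update(overloaded=False)  # 110.0
limiter.update(overloaded=True)   # 55.0
```

The asymmetry is the point: two healthy intervals gain 10 units of rate, while one overloaded interval gives up half the limit, so the limiter backs away from trouble much faster than it approaches it.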

The risks

  • A noisy signal can cause oscillation, swinging the limit wildly.
  • Cutting too aggressively can starve legitimate traffic.
  • It is harder to reason about and test than a fixed number.
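Two common mitigations for the first two risks are to smooth the signal before acting on it and to clamp the limit so it can never fall to zero. The sketch below uses an exponentially weighted moving average for smoothing; the SmoothedSignal class, the clamp_limit function, and the alpha, floor, and ceiling values are hypothetical choices for the example.

```python
class SmoothedSignal:
    """Exponentially weighted moving average of a noisy health signal."""

    def __init__(self, alpha=0.2):
        self.alpha = alpha  # weight of the newest sample (0 < alpha <= 1)
        self.value = None   # no estimate until the first sample arrives

    def add(self, sample: float) -> float:
        if self.value is None:
            self.value = sample
        else:
            self.value = self.alpha * sample + (1 - self.alpha) * self.value
        return self.value


def clamp_limit(proposed: float, floor: float = 10.0,
                ceiling: float = 1000.0) -> float:
    """Keep the limit within bounds so a harsh cut cannot starve all traffic."""
    return max(floor, min(ceiling, proposed))


latency = SmoothedSignal(alpha=0.5)
latency.add(100.0)  # 100.0
latency.add(200.0)  # 150.0: a single spike moves the estimate only halfway
```

A smaller alpha damps oscillation further but makes the limiter slower to notice real overload, so the value is a trade-off rather than a setting with one right answer.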

Key idea

Adaptive rate limiting moves the limit in a feedback loop driven by live health signals so it tracks real capacity instead of a fixed guess.

Check yourself

1. What drives an adaptive limiter to change the rate?

2. What pattern does adaptive limiting often resemble?

3. What is a key risk of adaptive limiting?