What Rate Limiting Is
Rate limiting caps how many requests a client may make within a given time window. As a security control, it slows automated abuse: brute-force attacks, scraping, and resource exhaustion become far less practical, even against an endpoint that is otherwise functioning correctly.
Common Strategies
- A token bucket refills allowance steadily and permits short bursts up to a cap.
- A fixed or sliding window counts requests per interval per key.
- Limits are keyed by client identity, such as account, API key, or source address, sometimes combined.
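The token bucket strategy above can be sketched in a few lines. This is a minimal in-memory illustration, not a production implementation: the names TokenBucket, allow_request, and buckets, and the capacity and refill values, are assumptions chosen for the example. Time is passed in explicitly where needed so the behavior is easy to follow.

```python
import time


class TokenBucket:
    """Illustrative token bucket: steady refill, bursts allowed up to `capacity`."""

    def __init__(self, capacity, refill_per_second, now=None):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.tokens = float(capacity)  # start full so an initial burst is permitted
        self.updated_at = time.monotonic() if now is None else now

    def allow(self, now=None):
        now = time.monotonic() if now is None else now
        # Refill in proportion to elapsed time, never exceeding the cap.
        elapsed = max(0.0, now - self.updated_at)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        self.updated_at = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0  # each request spends one token
            return True
        return False


# One bucket per client key (account, API key, or source address).
buckets = {}


def allow_request(client_key, now=None):
    bucket = buckets.get(client_key)
    if bucket is None:
        # Assumed policy for the sketch: burst of 5, refill of 1 request per second.
        bucket = buckets[client_key] = TokenBucket(5, 1.0, now=now)
    return bucket.allow(now)
```

Because each key has its own bucket, one client exhausting its allowance does not affect others. A real deployment would also need shared storage across servers and eviction of idle keys, which this sketch omits.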
Using It For Security
- Apply tighter limits on sensitive endpoints like login, password reset, and token issuance.
- Return a clear throttling response, such as HTTP 429 (Too Many Requests) with a Retry-After header, so well-behaved clients back off.
- Enforce limits at the edge so abusive traffic is shed before reaching app servers.
- Pair with monitoring, since a spike in throttling is a useful attack signal.
Key Idea
Rate limiting raises the cost of automation, so cap requests per client, tighten limits on sensitive endpoints, and enforce at the edge.