Why limit per user
A burst of events can flood a single user with dozens of pushes. Rate limiting caps how many notifications a user receives within a time window, which protects the user's attention and your sender reputation.
Common algorithms
- A token bucket gives each user tokens that refill over time; each send spends one.
- A fixed or sliding window counts sends per user per interval and rejects any send beyond the cap.
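A minimal sketch of the token bucket idea, assuming an in-process limiter for a single user. The class name, the `capacity` and `refill_per_sec` parameters, and the injectable `now` argument are illustrative choices, not part of any particular library.

```python
import time


class TokenBucket:
    """Illustrative per-user token bucket; names and parameters are assumptions."""

    def __init__(self, capacity, refill_per_sec, now=None):
        self.capacity = capacity              # largest burst the user can receive
        self.refill_per_sec = refill_per_sec  # tokens regained per second
        self.tokens = float(capacity)         # start with a full bucket
        self.updated = time.monotonic() if now is None else now

    def try_send(self, now=None):
        """Spend one token if available; return True when the send is allowed."""
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.updated)
        # Refill for the time that has passed, never exceeding capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

In practice one bucket would be kept per user, for example in a dictionary keyed by user ID. The capacity sets how large a burst is tolerated, and the refill rate sets the long-run average.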
Limits are often per category so critical alerts are not blocked by chatty marketing.
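Per-category limits can be sketched with a sliding window that keeps a separate log of send times for each (user, category) pair. The category names, caps, and window lengths below are hypothetical values chosen for illustration.

```python
from collections import defaultdict, deque

# Hypothetical caps: category -> (max sends, window length in seconds).
LIMITS = {"marketing": (2, 3600), "security": (10, 3600)}


class SlidingWindowLimiter:
    def __init__(self, limits):
        self.limits = limits
        # (user_id, category) -> timestamps of sends still inside the window
        self.sent = defaultdict(deque)

    def allow(self, user_id, category, now):
        cap, window = self.limits[category]
        log = self.sent[(user_id, category)]
        # Discard sends that have aged out of the window.
        while log and log[0] <= now - window:
            log.popleft()
        if len(log) >= cap:
            return False  # this category is at its cap for this user
        log.append(now)
        return True
```

Because each category has its own log, exhausting the marketing cap leaves the security allowance untouched.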
What happens at the cap
When a user hits the limit, the system can drop the message, delay it, or roll it into a digest instead of sending it immediately. Critical messages bypass the cap entirely.
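The choice at the cap can be expressed as a small decision function. This is a sketch under assumed names: the `Action` enum, the `OVERFLOW_POLICY` mapping, and the category labels are hypothetical, and the default of dropping unknown categories is one possible choice among several.

```python
from enum import Enum


class Action(Enum):
    SEND = "send"
    DROP = "drop"
    DELAY = "delay"
    DIGEST = "digest"


# Hypothetical mapping from category to overflow behaviour.
OVERFLOW_POLICY = {
    "marketing": Action.DROP,
    "reminder": Action.DELAY,
    "social": Action.DIGEST,
}


def decide(category, under_cap, critical=False):
    """Return what to do with one message for one user."""
    # Critical messages bypass the cap; everything else sends only while under it.
    if critical or under_cap:
        return Action.SEND
    # Assumption: categories without an explicit policy are dropped.
    return OVERFLOW_POLICY.get(category, Action.DROP)
```

For example, `decide("social", under_cap=False)` yields `Action.DIGEST`, while the same call with `critical=True` yields `Action.SEND`.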
Distributed counters
In a sharded system, the counter must be shared, usually in a fast store such as Redis, and updated atomically so that all workers see the same count for a user.
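A fixed-window counter on a shared store might look like the sketch below. It relies only on two operations, an atomic increment and a key expiry, which Redis provides as INCR and EXPIRE. `InMemoryStore` is a stand-in so the example runs without a server, and the key format and `allow` signature are assumptions.

```python
class InMemoryStore:
    """Stand-in for a shared store; a real deployment would use a Redis client."""

    def __init__(self):
        self.data = {}

    def incr(self, key):
        self.data[key] = self.data.get(key, 0) + 1
        return self.data[key]

    def expire(self, key, seconds):
        return True  # expiry is not simulated in this stand-in


def allow(store, user_id, cap, window_sec, now):
    """Count one send attempt in the user's current fixed window."""
    window_index = int(now // window_sec)
    key = f"rl:{user_id}:{window_index}"  # hypothetical key layout
    count = store.incr(key)  # atomic on a real shared store
    if count == 1:
        # First hit in this window: let the key clean itself up later.
        store.expire(key, window_sec * 2)
    return count <= cap
```

Two caveats apply to this sketch. Rejected attempts still increment the counter, and the increment and expiry are separate calls, so a crash between them could leave a key without a time to live. Because the window index is part of the key, such a stale key would not affect later windows.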
Key idea
Per-user rate limiting caps message volume with token buckets or windows, and defers or digests overflow instead of spamming.