The problem with naive retries
When a client gets a 429 or a 503, retrying immediately makes things worse. A flood of instant retries can keep a struggling server pinned. Worse, if many clients fail at the same moment and all retry on the same schedule, they form a thundering herd that hits the server in synchronized waves.
Exponential backoff
The fix is exponential backoff: wait a base delay, then double it after each failure, up to a cap. Delays might be one, two, four, then eight seconds. This rapidly thins the retry rate so the server gets room to recover.
Adding jitter
Backoff alone still lets clients retry in lockstep. Adding jitter, a random offset on each delay, spreads retries across time and breaks the synchronized waves. Full jitter picks a random wait between zero and the current backoff ceiling.
Honoring retry after
If the server sends a retry after header, the client should respect it directly rather than guessing, since the server knows when its window resets.
Key idea
Retry with exponential backoff plus jitter, and honor retry after, so throttled clients let the server recover instead of stampeding it.