The problem
Some work triggered by a request is slow: sending email, resizing images, generating reports. Doing it inline makes the user wait and ties up a web worker for the duration. Asynchronous offload moves that work out of the request path.
How it works
The request does the minimum needed to respond, then hands the slow task to a background worker through a queue or task system.
- The user gets a fast response, often an acknowledgement.
- A separate worker pool processes the task later.
- The result is delivered by polling, a webhook, or a notification.
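The flow above can be sketched in a few lines. This is a minimal in-process illustration, not a production setup: `queue.Queue` stands in for a real broker, a plain dict stands in for a result store, and `slow_task`, `handle_request`, `worker`, and `get_status` are hypothetical names chosen for this example.

```python
import queue
import threading
import uuid

tasks = queue.Queue()            # stand-in for a real message broker (assumption)
results = {}                     # job_id -> {"status", "result"}; stand-in for a result store
results_lock = threading.Lock()


def slow_task(payload):
    # Hypothetical slow work (email, image resize, report generation).
    return payload.upper()


def handle_request(payload):
    """Request path: record the job, enqueue it, acknowledge immediately."""
    job_id = str(uuid.uuid4())
    with results_lock:
        results[job_id] = {"status": "pending", "result": None}
    tasks.put((job_id, payload))
    return {"job_id": job_id, "status": "accepted"}


def worker():
    """Background worker: pull tasks until a None sentinel arrives."""
    while True:
        item = tasks.get()
        if item is None:
            tasks.task_done()
            break
        job_id, payload = item
        try:
            outcome = {"status": "done", "result": slow_task(payload)}
        except Exception as exc:
            outcome = {"status": "failed", "result": str(exc)}
        with results_lock:
            results[job_id] = outcome
        tasks.task_done()


def get_status(job_id):
    """Polling endpoint: return a copy of the job's current state."""
    with results_lock:
        return dict(results[job_id])
```

A caller would start one or more threads running `worker`, call `handle_request` to get a `job_id` back right away, and poll `get_status(job_id)` until the status is no longer `"pending"`. In a real deployment the queue and result store would typically live outside the web process so that workers can run on separate machines.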
Why it scales
- Web servers stay free to handle more incoming requests.
- Workers can scale independently based on backlog.
- A slow downstream service no longer adds to user-facing latency.
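Scaling workers on backlog can be as simple as sizing the pool so the current queue drains within a target time. The sketch below is one hypothetical policy, not a standard formula: `desired_workers` and all of its parameters are assumptions made for illustration.

```python
import math


def desired_workers(backlog, tasks_per_worker_per_sec, target_drain_secs,
                    min_workers=1, max_workers=20):
    """Return a worker count that would drain `backlog` in roughly
    `target_drain_secs`, clamped to [min_workers, max_workers]."""
    capacity_per_worker = tasks_per_worker_per_sec * target_drain_secs
    needed = math.ceil(backlog / capacity_per_worker)
    return max(min_workers, min(max_workers, needed))
```

For example, with 1000 queued tasks, 5 tasks per second per worker, and a 60 second drain target, each worker clears 300 tasks in the window, so the policy asks for 4 workers. The web tier is not part of this calculation at all, which is the point: it scales on request rate, the worker tier on queue depth.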
You trade immediate completion for responsiveness, so the design must handle results that arrive later and tasks that fail and are retried.
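Retry handling on the worker side might look like the following sketch. It assumes exponential backoff is acceptable and that the task is safe to run more than once; `run_with_retries` and its parameters are hypothetical names, and the `sleep` argument exists only so the delay can be swapped out in tests.

```python
import time


def run_with_retries(task, max_attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call `task()` up to `max_attempts` times, doubling the delay
    between attempts. Returns a status dict instead of raising."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return {"status": "done", "result": task(), "attempts": attempt}
        except Exception as exc:
            last_error = exc
            if attempt < max_attempts:
                # Back off before the next attempt: base, 2x base, 4x base, ...
                sleep(base_delay * 2 ** (attempt - 1))
    return {"status": "failed", "error": str(last_error), "attempts": max_attempts}
```

Because a retried task may already have partly run, the work itself should tolerate repetition (for example, by checking whether the email was already sent before sending again). The returned status dict is what a polling endpoint or webhook would eventually report to the user.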
Key idea
Defer slow work to background workers so requests stay fast and the web and worker tiers scale independently.