A spike like no other
A flash sale sells limited stock at a deep discount at a fixed start time. The traffic pattern is brutal: a massive, synchronized surge that can be a hundred times normal load, all contending for the same few products.
Defending the system
- Queue and admit: place arriving users in a virtual waiting room and admit them at a controlled rate so backends are not overwhelmed.
- Shed load early: reject excess requests at the edge with a friendly page rather than letting them crash the core.
- Cache the static parts: serve product pages and assets from a CDN so only the buy action hits the database.
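The first two defenses can be sketched together as a small in-process model. This is only an illustration under stated assumptions: the `WaitingRoom` class, its `admit_per_tick` and `max_waiting` parameters, and the idea of a periodic `tick` are hypothetical names chosen here, not part of any particular product. A real waiting room would typically sit at the edge and keep its queue in shared storage.

```python
from collections import deque
from typing import Deque, List


class WaitingRoom:
    """FIFO waiting room with a capped queue and a fixed admission rate.

    Hypothetical sketch: names and parameters are assumptions for illustration.
    """

    def __init__(self, admit_per_tick: int, max_waiting: int) -> None:
        self.admit_per_tick = admit_per_tick  # users released to the backend per tick
        self.max_waiting = max_waiting        # beyond this, arrivals are shed
        self._queue: Deque[str] = deque()

    def arrive(self, user_id: str) -> bool:
        """Queue the user, or shed the request if the room is already full."""
        if len(self._queue) >= self.max_waiting:
            # Shed early: the caller would show a friendly "try again later" page.
            return False
        self._queue.append(user_id)
        return True

    def tick(self) -> List[str]:
        """Admit up to admit_per_tick users, oldest first."""
        admitted: List[str] = []
        for _ in range(min(self.admit_per_tick, len(self._queue))):
            admitted.append(self._queue.popleft())
        return admitted
```

With `WaitingRoom(admit_per_tick=3, max_waiting=5)`, seven arrivals result in five queued users and two shed requests, and each call to `tick()` releases at most three users to the backend in arrival order.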
The hot inventory problem
The real bottleneck is decrementing one product's stock under extreme concurrency. Techniques include pre-allocating stock into tokens, using atomic counters in an in-memory store, and rejecting all remaining requests immediately once the tokens run out.
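A minimal sketch of the token idea follows. It assumes a single process and uses a lock-protected counter as a stand-in for the atomic decrement an in-memory store would provide; `StockTokens`, `try_claim`, and `simulate` are hypothetical names introduced for illustration.

```python
import threading
from typing import List


class StockTokens:
    """Pre-allocated stock tokens with an atomic check-and-decrement.

    Assumption: the lock stands in for an in-memory store's atomic counter.
    """

    def __init__(self, stock: int) -> None:
        self._remaining = stock
        self._lock = threading.Lock()

    def try_claim(self) -> bool:
        """Claim one token, or fail fast once the stock is exhausted."""
        with self._lock:
            if self._remaining <= 0:
                return False  # no tokens left: reject immediately
            self._remaining -= 1
            return True


def simulate(stock: int, buyers: int) -> int:
    """Run concurrent buyers against the tokens and return successful claims."""
    tokens = StockTokens(stock)
    results: List[bool] = []

    def buy() -> None:
        results.append(tokens.try_claim())

    threads = [threading.Thread(target=buy) for _ in range(buyers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(results)  # number of True results, i.e. units sold
```

Calling `simulate(stock=10, buyers=200)` returns 10: because the check and the decrement happen under one lock, the number of successful claims can never exceed the allocated stock, and the other 190 buyers are rejected without touching a database.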
Fairness and correctness
Even under load the system must not oversell. It is better to reject a buyer than to sell stock that does not exist.
Key idea
Flash sales need a waiting room, edge load shedding, and atomic stock tokens so the system never oversells under extreme concurrency.