The idea
Horizontal scaling means handling more load by adding more machines rather than buying a bigger one. The trick that makes it cheap is keeping each server stateless: it keeps no session or user data in its own memory or on its local disk between requests.
Why stateless matters
- Any server can answer any request, so a load balancer can route freely.
- A crashed server loses only the requests it was handling at that moment, so recovery is just replacing it.
- You can add or remove instances during a traffic spike without migrating data.
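To make the three points above concrete, here is a minimal sketch in Python. The names Server, RoundRobinBalancer, and the app-1/app-2/app-3 instance names are hypothetical and chosen only for illustration; a real deployment would use an actual load balancer rather than a class like this.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Server:
    """A stateless worker: it has a name but stores nothing about users."""
    name: str

    def handle(self, request: dict) -> str:
        # The response depends only on the request, never on earlier calls.
        return f"hello, {request['user']}"


class RoundRobinBalancer:
    """Hypothetical balancer that hands each request to the next server."""

    def __init__(self, servers):
        self.servers = list(servers)
        self._turn = 0

    def add(self, server: Server) -> None:
        # Scaling out: no data has to be copied to the new instance.
        self.servers.append(server)

    def remove(self, name: str) -> None:
        # Scaling in, or a crash: nothing has to be migrated away.
        self.servers = [s for s in self.servers if s.name != name]

    def route(self, request: dict):
        if not self.servers:
            raise RuntimeError("no servers available")
        server = self.servers[self._turn % len(self.servers)]
        self._turn += 1
        return server.name, server.handle(request)


lb = RoundRobinBalancer([Server("app-1"), Server("app-2")])
request = {"user": "ada"}

first = lb.route(request)    # served by one instance
second = lb.route(request)   # served by a different instance
lb.add(Server("app-3"))      # add capacity during a spike
third = lb.route(request)
lb.remove("app-1")           # simulate a crashed instance
fourth = lb.route(request)
```

Whichever instance the balancer picks, the response is the same, which is why adding or removing instances needs no coordination between them.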
Where the state goes
The state still exists; it just moves to shared backing stores.
- Session data lives in a shared cache like Redis.
- Persistent data lives in a database.
- Files live in object storage.
The servers become interchangeable workers, and the hard problem of holding state is pushed to a few systems built for it.
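The sketch below illustrates the idea of moving session state out of the servers. It assumes an in-memory SessionStore class as a stand-in for a shared cache such as Redis; SessionStore, AppServer, and add_to_cart are hypothetical names, not any library's API.

```python
class SessionStore:
    """In-memory stand-in for a shared cache (an assumption for this sketch)."""

    def __init__(self):
        self._data = {}

    def get(self, session_id: str) -> dict:
        # Return a copy so callers cannot mutate the store by accident.
        return dict(self._data.get(session_id, {}))

    def put(self, session_id: str, session: dict) -> None:
        self._data[session_id] = dict(session)


class AppServer:
    """Stateless worker: it holds a handle to the store, not the sessions."""

    def __init__(self, name: str, store: SessionStore):
        self.name = name
        self.store = store

    def add_to_cart(self, session_id: str, item: str) -> list:
        # Read the session, change it, and write it back on every request.
        session = self.store.get(session_id)
        cart = session.get("cart", []) + [item]
        session["cart"] = cart
        self.store.put(session_id, session)
        return cart


store = SessionStore()
server_a = AppServer("app-a", store)
server_b = AppServer("app-b", store)

server_a.add_to_cart("session-1", "book")
cart_on_b = server_b.add_to_cart("session-1", "pen")  # a different server

del server_a                                # simulate a crash
server_c = AppServer("app-c", store)        # replacement instance
cart_on_c = server_c.add_to_cart("session-1", "lamp")
```

The cart survives both the switch between servers and the loss of one of them, because no server ever owned it. The read-modify-write in add_to_cart is not safe under concurrent requests for the same session; a real store would need an atomic operation or locking for that.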
Key idea
Make the application tier stateless so you can scale it by simply adding identical replicas.