Why WebSockets are different
A WebSocket is a persistent, two-way connection. Unlike an HTTP request, it stays open after the first exchange, which changes how you scale.
- Each connection holds server memory and a file descriptor for its lifetime.
- A user connected to one server must still receive messages produced elsewhere.
- Load balancers must support the HTTP Upgrade handshake and keep each connection pinned to the server that accepted it.
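As a rough illustration of what "supporting upgrade" involves, the sketch below checks whether a request asks for a WebSocket upgrade and computes the Sec-WebSocket-Accept value the server returns. The GUID and the SHA-1 plus Base64 steps come from RFC 6455; the helper names is_upgrade_request and accept_key are made up for this example and belong to no particular library.

```python
import base64
import hashlib

# Fixed GUID that RFC 6455 defines for the opening handshake.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def is_upgrade_request(headers: dict[str, str]) -> bool:
    """Return True if the headers ask to switch to the WebSocket protocol.

    Hypothetical helper: a proxy or load balancer has to pass these headers
    through untouched for the handshake to succeed.
    """
    lowered = {k.lower(): v for k, v in headers.items()}
    connection_tokens = {
        token.strip().lower() for token in lowered.get("connection", "").split(",")
    }
    return lowered.get("upgrade", "").lower() == "websocket" and "upgrade" in connection_tokens


def accept_key(sec_websocket_key: str) -> str:
    """Compute the Sec-WebSocket-Accept header value for a client key."""
    digest = hashlib.sha1((sec_websocket_key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")
```

Once the server answers with this value and a 101 status, the same TCP connection carries WebSocket frames, which is why the load balancer cannot move it to another server mid-session.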
The fan-out problem
When one server holds a connection but a message originates on another, you need a backplane: a shared messaging layer that relays each message to every server, so the one holding the connection can deliver it.
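The idea can be sketched with an in-process stand-in for the backplane. This is a minimal sketch, not a production design: the Backplane and Server classes are hypothetical, a real deployment would put a networked pub/sub system in the Backplane role, and a plain list stands in for each client's socket.

```python
from typing import Callable


class Backplane:
    """In-process stand-in for a shared pub/sub system (assumption)."""

    def __init__(self) -> None:
        self._handlers: list[Callable[[str, str], None]] = []

    def subscribe(self, handler: Callable[[str, str], None]) -> None:
        self._handlers.append(handler)

    def publish(self, user_id: str, message: str) -> None:
        # Broadcast to every subscribed server; each one decides locally
        # whether it holds a connection for this user.
        for handler in list(self._handlers):
            handler(user_id, message)


class Server:
    """Hypothetical WebSocket server node."""

    def __init__(self, name: str, backplane: Backplane) -> None:
        self.name = name
        self.backplane = backplane
        # user_id -> outbox; the list stands in for a live socket.
        self.connections: dict[str, list[str]] = {}
        backplane.subscribe(self._on_backplane_message)

    def connect(self, user_id: str) -> None:
        self.connections[user_id] = []

    def send(self, user_id: str, message: str) -> None:
        # Always go through the backplane, since this server cannot know
        # which node currently holds the user's connection.
        self.backplane.publish(user_id, message)

    def _on_backplane_message(self, user_id: str, message: str) -> None:
        outbox = self.connections.get(user_id)
        if outbox is not None:
            outbox.append(message)


backplane = Backplane()
server_a = Server("a", backplane)
server_b = Server("b", backplane)
server_a.connect("alice")        # alice's socket lives on server A
server_b.send("alice", "hello")  # the message originates on server B
```

Here every server sees every message and drops the ones it cannot deliver. That is simple, but the cost grows with the number of servers.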
Scaling techniques
- Use a pub/sub backplane so any server can deliver to any connected client.
- Shard connections by room or topic to limit fan-out.
- Track presence in a shared store, not in one server's memory.
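Two of these techniques can be sketched briefly. The first function maps a topic to a shard with a stable hash, so every server agrees on where a room lives. The second piece is a dict-backed stand-in for a shared presence store. All names here (shard_for, PresenceStore) are hypothetical. The expiry time on presence entries is an added assumption, there so that entries written by a crashed server eventually disappear.

```python
import hashlib
import time
from typing import Optional


def shard_for(topic: str, num_shards: int) -> int:
    """Map a topic to a shard index using a hash that is stable across processes."""
    digest = hashlib.sha256(topic.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class PresenceStore:
    """Dict-backed stand-in for a store shared by all servers (assumption)."""

    def __init__(self, ttl_seconds: float) -> None:
        self.ttl_seconds = ttl_seconds
        # user_id -> (server name, expiry timestamp)
        self._entries: dict[str, tuple[str, float]] = {}

    def mark_online(self, user_id: str, server: str, now: Optional[float] = None) -> None:
        now = time.time() if now is None else now
        self._entries[user_id] = (server, now + self.ttl_seconds)

    def lookup(self, user_id: str, now: Optional[float] = None) -> Optional[str]:
        """Return the server holding the user's connection, or None if offline."""
        now = time.time() if now is None else now
        entry = self._entries.get(user_id)
        if entry is None or entry[1] <= now:
            # Missing or expired: treat the user as offline.
            self._entries.pop(user_id, None)
            return None
        return entry[0]
```

Python's built-in hash() is randomized per process for strings, so a cryptographic digest is used to keep the mapping identical on every server. Note that changing num_shards remaps most topics; consistent hashing is a common way to soften that, but it is not shown here.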
Operational notes
- Plan for reconnect storms after a deploy or network blip.
- Cap connections per node and add nodes to grow capacity.
- Use heartbeats to detect and clean up dead connections.
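A minimal sketch of two of these notes follows. The first function spreads client reconnects out using exponential backoff with random jitter, one common way to soften a reconnect storm. The second finds connections whose last heartbeat is too old. The function names and default values are illustrative assumptions, not recommendations.

```python
import random


def reconnect_delay(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Seconds a client waits before reconnect attempt number `attempt` (0-based).

    The upper bound doubles with each attempt up to `cap`, and the actual
    delay is drawn uniformly below that bound so clients do not retry in
    lockstep.
    """
    exponent = min(attempt, 32)  # keep 2 ** exponent small
    ceiling = min(cap, base * (2 ** exponent))
    return random.uniform(0.0, ceiling)


def find_dead(last_seen: dict[str, float], now: float, timeout: float) -> list[str]:
    """Return ids of connections with no heartbeat in the last `timeout` seconds.

    `last_seen` maps a connection id to the timestamp of its latest heartbeat.
    """
    return [conn_id for conn_id, seen in last_seen.items() if now - seen > timeout]
```

A server would typically call something like find_dead on a timer and close the returned connections, which releases their memory and file descriptors.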
Key idea
WebSockets scale by adding a pub/sub backplane and sharding by topic, because a message must reach its user no matter which server holds that user's connection.