The core trade
High availability means the system keeps serving despite the loss of individual components. The price is redundancy: running more copies than the minimum so that losing one does not lose the service.
Redundancy versus replication
- Redundancy is having spare capacity, like a second server that can take over.
- Replication is keeping multiple copies of data in sync so any copy can serve reads or become the new primary.
Redundancy without replication still loses data when a node dies: the spare can take over the work, but it does not hold the state the failed node had. Replication is what makes the spare actually useful.
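The difference can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the `Node` class, the `write` and `failover_read` helpers, and the key `order-1` are all hypothetical names, and a node's state is modeled as an in-memory dict.

```python
class Node:
    """A hypothetical server whose state is just an in-memory dict."""

    def __init__(self, name):
        self.name = name
        self.data = {}


def write(primary, key, value, replicas=()):
    """Store a value on the primary and copy it to any replicas."""
    primary.data[key] = value
    for node in replicas:
        node.data[key] = value


def failover_read(candidate, key):
    """Read from whichever node took over after the primary died."""
    return candidate.data.get(key)


primary = Node("primary")
cold_spare = Node("cold-spare")  # redundancy only: spare capacity, no data
replica = Node("replica")        # redundancy plus replication

write(primary, "order-1", "paid", replicas=[replica])

# The primary dies here; compare what each candidate can serve.
print(failover_read(cold_spare, "order-1"))  # None
print(failover_read(replica, "order-1"))     # paid
```

Both spares can accept new traffic, but only the replica can answer for writes made before the failure.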
How many copies
Availability comes from independence. Two copies on the same rack share a power supply, so a single power failure takes out both. Spreading replicas across racks, zones, or regions reduces the number of failure domains they share.
- More replicas means more durability and read capacity.
- More replicas also means more cost and more coordination on writes.
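The effect of independence can be put in numbers. The sketch below assumes each copy is up with the same probability and that copies fail independently; the second function adds one shared component (such as a rack power supply) that every copy depends on. The function names and the figures 0.99 and 0.999 are illustrative assumptions, not measurements.

```python
def availability_independent(per_copy, copies):
    """Probability that at least one of `copies` independent copies is up."""
    return 1.0 - (1.0 - per_copy) ** copies


def availability_shared_domain(per_copy, copies, domain):
    """Same copies, but all depend on one shared component that is up
    with probability `domain` (assumed independent of the copies)."""
    return domain * availability_independent(per_copy, copies)


if __name__ == "__main__":
    # Assumed per-copy availability of 0.99.
    for n in (1, 2, 3):
        print(n, "independent copies:", round(availability_independent(0.99, n), 6))
    # Two copies behind one shared component that is up 0.999 of the time.
    print("2 copies, shared domain:", round(availability_shared_domain(0.99, 2, 0.999), 6))
```

Under these assumptions, the shared component caps availability: two copies behind a 0.999 component can never do better than 0.999, however reliable each copy is.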
The hidden cost
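A write is only as safe as the number of replicas it has reached. Synchronous replication waits for replicas to acknowledge each write before confirming it, which adds latency. Asynchronous replication confirms immediately and copies in the background, so it is faster but can lose recent writes if the primary fails before they are copied.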
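A toy model makes the data-loss side of this trade concrete. It is a minimal sketch, not a real replication protocol: `Primary`, `Replica`, `ship_pending`, and `acknowledged_but_lost` are hypothetical names, there is a single replica, and latency is not modeled at all, only which acknowledged writes survive a primary crash.

```python
from collections import deque


class Replica:
    def __init__(self):
        self.log = []

    def apply(self, record):
        self.log.append(record)


class Primary:
    def __init__(self, replica, synchronous):
        self.log = []
        self.replica = replica
        self.synchronous = synchronous
        self.pending = deque()  # acknowledged records not yet copied

    def write(self, record):
        """Acknowledge a write (by returning) in sync or async mode."""
        self.log.append(record)
        if self.synchronous:
            self.replica.apply(record)   # copy first, then acknowledge
        else:
            self.pending.append(record)  # acknowledge now, copy later

    def ship_pending(self):
        """Background copy step used only in asynchronous mode."""
        while self.pending:
            self.replica.apply(self.pending.popleft())


def acknowledged_but_lost(synchronous):
    """Return writes the client saw acknowledged that the replica lacks
    when the primary crashes right after the last write."""
    replica = Replica()
    primary = Primary(replica, synchronous)
    primary.write("a")
    primary.ship_pending()
    primary.write("b")
    # The primary crashes here, before the next background copy.
    return [r for r in primary.log if r not in replica.log]


if __name__ == "__main__":
    print("sync lost:", acknowledged_but_lost(True))    # []
    print("async lost:", acknowledged_but_lost(False))  # ['b']
```

In the asynchronous run the client was told that "b" succeeded, yet the promoted replica never received it. That window between acknowledgment and copy is what synchronous replication pays latency to close.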
Key idea
Availability is bought with independent redundant copies, and replication is what makes those copies useful.