System Design

Replication for Durability

Keeping multiple copies so data survives disks, machines, and whole regions failing.

5 min read · core

Copies guard against loss

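Replication stores the same data on multiple independent devices so that losing one device does not lose the data. Durability is the probability that committed data survives over a given period. Adding replicas raises it sharply: data is lost only if every replica fails before any of them can be repaired, and with independent failures that joint probability shrinks multiplicatively with each added copy.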
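
As a rough illustration of why extra copies help so much, the sketch below assumes each replica fails independently with the same probability over some period and that no repair happens in between. The function names and the 1% failure figure are hypothetical, chosen only for the example.

```python
def loss_probability(p_fail: float, replicas: int) -> float:
    """Chance that every replica fails in the same period.

    Assumes failures are independent and equally likely (a simplification).
    """
    if not 0.0 <= p_fail <= 1.0:
        raise ValueError("p_fail must be between 0 and 1")
    if replicas < 1:
        raise ValueError("need at least one replica")
    return p_fail ** replicas


def durability(p_fail: float, replicas: int) -> float:
    """Chance that at least one replica survives the period."""
    return 1.0 - loss_probability(p_fail, replicas)


if __name__ == "__main__":
    # Hypothetical 1% per-replica failure chance over the period.
    for n in (1, 2, 3):
        print(f"{n} replica(s): loss probability {loss_probability(0.01, n):.0e}")
```

Under these assumptions each added replica multiplies the loss probability by the per-replica failure chance, which is why the improvement is so steep. Real failures are rarely fully independent, so this is best read as an upper bound on the benefit.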

Independence is everything

Replicas only help if their failures are independent. Two copies on the same disk, same machine, or same power rail can die together. Strong durability spreads replicas across separate disks, racks, and even data center regions so a single fault cannot take them all.
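
One way to picture failure-domain spreading is a greedy placement that never reuses a rack and prefers a region it has not used yet. This is only a sketch: the `Node` type, its fields, and `place_replicas` are hypothetical names, and real placement policies also weigh capacity, load, and network cost.

```python
from typing import List, NamedTuple


class Node(NamedTuple):
    name: str
    rack: str
    region: str


def place_replicas(nodes: List[Node], count: int) -> List[Node]:
    """Pick `count` nodes so that no two share a rack, preferring new regions."""
    chosen: List[Node] = []
    remaining = list(nodes)
    while len(chosen) < count:
        used_racks = {n.rack for n in chosen}
        used_regions = {n.region for n in chosen}
        # Never place two replicas in the same rack.
        candidates = [n for n in remaining if n.rack not in used_racks]
        if not candidates:
            raise ValueError("not enough distinct racks for the requested replicas")
        # False sorts before True, so nodes in unused regions win; ties keep input order.
        best = min(candidates, key=lambda n: n.region in used_regions)
        chosen.append(best)
        remaining.remove(best)
    return chosen


if __name__ == "__main__":
    cluster = [
        Node("a", "rack1", "east"),
        Node("b", "rack1", "east"),
        Node("c", "rack2", "east"),
        Node("d", "rack3", "west"),
    ]
    print([n.name for n in place_replicas(cluster, 3)])
```

In this example node `b` is never picked alongside `a`, because the two share a rack and could fail together.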

Synchronous versus asynchronous

  • Synchronous replication acknowledges a write only after enough replicas have stored it, so an acknowledged write survives as long as at least one of those replicas does, at the cost of higher write latency.
  • Asynchronous replication acknowledges immediately and copies in the background, which is faster but can lose the most recent acknowledged writes if the primary fails before they are copied.
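
The difference between the two modes can be sketched with a toy in-memory primary and one follower. Everything here (`Replica`, `Primary`, `flush`, `surviving_writes`) is a hypothetical illustration, not any real database's API, and the synchronous mode waits for every follower rather than a configurable quorum to keep the example short.

```python
from typing import List


class Replica:
    def __init__(self) -> None:
        self.log: List[str] = []


class Primary:
    def __init__(self, followers: List[Replica], synchronous: bool) -> None:
        self.log: List[str] = []
        self.followers = followers
        self.synchronous = synchronous
        self.pending: List[str] = []  # acknowledged writes not yet copied

    def write(self, value: str) -> str:
        self.log.append(value)
        if self.synchronous:
            # Copy to every follower before acknowledging.
            for follower in self.followers:
                follower.log.append(value)
        else:
            # Acknowledge now; the copy happens later in flush().
            self.pending.append(value)
        return "ack"

    def flush(self) -> None:
        """Background copy step used by the asynchronous mode."""
        for value in self.pending:
            for follower in self.followers:
                follower.log.append(value)
        self.pending.clear()


def surviving_writes(synchronous: bool) -> List[str]:
    """Write 'a' and 'b', then lose the primary before 'b' is flushed."""
    follower = Replica()
    primary = Primary([follower], synchronous)
    primary.write("a")
    primary.flush()
    primary.write("b")  # acknowledged to the client in both modes
    del primary         # simulated crash: only the follower remains
    return follower.log


if __name__ == "__main__":
    print("sync follower log: ", surviving_writes(True))
    print("async follower log:", surviving_writes(False))
```

In the asynchronous run the client received an acknowledgement for `b`, yet the follower never saw it. That gap is the durability cost of the lower latency.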

Restoring the count

When a replica is lost, the system re-replicates the data to restore the target replica count before another failure can cause data loss. Faster repair means a smaller window of vulnerability.
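
To see how repair time affects risk, the sketch below assumes each remaining replica fails independently at a constant rate (an exponential failure model) and asks how likely it is that all of them fail before the repair finishes. The function name and the rate used are hypothetical, picked only to show the trend.

```python
import math


def p_loss_during_repair(
    failure_rate_per_hour: float,
    repair_hours: float,
    remaining_replicas: int,
) -> float:
    """Chance that every remaining replica fails within the repair window.

    Assumes independent failures at a constant rate (exponential model).
    """
    # Probability that one replica fails at some point during the window.
    p_one = 1.0 - math.exp(-failure_rate_per_hour * repair_hours)
    return p_one ** remaining_replicas


if __name__ == "__main__":
    rate = 1e-4  # hypothetical failures per replica-hour
    for hours in (24.0, 1.0):
        risk = p_loss_during_repair(rate, hours, remaining_replicas=2)
        print(f"repair in {hours:>4} h: loss risk {risk:.2e}")
```

With two replicas left, shortening the window lowers the risk roughly with the square of the repair time under this model, so repair speed matters more as the replica count grows.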

Key idea

Replication raises durability by keeping independent copies across separate failure domains. Synchronous writes prevent the loss of acknowledged data, and prompt re-replication shrinks the window in which further failures could destroy the last copy.

Check yourself

1. Why must replicas be spread across separate failure domains?

2. What does synchronous replication guarantee that asynchronous does not?

3. Why does faster re-replication improve durability?