Replication

How keeping copies of your data on several machines buys you durability and faster reads.

Copies on purpose

Replication keeps the same data on more than one node. It serves two goals at once: surviving the loss of a machine, and serving reads from whichever copy is closest or least busy.

The common shape is leader and follower. Writes go to the leader, which streams its changes to followers. Reads can be served by any replica.

Sync versus async

The big trade is when the leader confirms a write.

Synchronous replication waits until a follower has the data, so no acknowledged write is lost, but the leader stalls if a follower is slow.
Asynchronous replication confirms immediately and ships changes in the background, which is fast but can lose recent writes if the leader dies.

Replication lag

Because followers trail the leader, a read from a follower may return slightly stale data. This replication lag can surprise a user who writes then immediately reads from a lagging replica. Routing a user back to the leader for their own recent reads is one common fix.

Key idea

Replication trades a little staleness or write latency for durability and scalable reads.

Copies on purpose

Sync versus async

Replication lag

Key idea

Check yourself