Read Replicas and Read Scaling

Replicas copy the primary so reads can fan out, but replication lag means they can return stale data.

Why Replicas Exist

Most applications read far more than they write. A single primary can become a bottleneck for reads long before it does for writes. Read replicas are copies of the primary that continuously apply its changes, letting you spread read traffic across many machines.

How It Works

The primary accepts all writes and records them in a replication stream.
Each replica subscribes to that stream and replays the changes to stay current.
Your application sends writes to the primary and routes reads to replicas.
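The routing step above can be sketched in a few lines. This is a minimal illustration, not a production router: the class name Router, its method names, and the string connection labels are all hypothetical, and round-robin is just one simple way to spread reads.

```python
import itertools


class Router:
    """Hypothetical router: writes go to the primary, reads rotate across replicas."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # Round-robin over replicas; fall back to the primary if none are configured.
        self._read_targets = itertools.cycle(replicas or [primary])

    def for_write(self):
        # Every write must go to the primary, the only node that accepts writes.
        return self.primary

    def for_read(self):
        # Each call returns the next replica in rotation.
        return next(self._read_targets)


router = Router("primary", ["replica-1", "replica-2"])
print(router.for_write())  # primary
print(router.for_read())   # replica-1
print(router.for_read())   # replica-2
```

In a real application the strings would be connection pools or database handles, and the rotation might weight replicas by load or health.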

The Catch: Replication Lag

Replication is usually asynchronous, so a replica can be behind the primary at any given moment. The gap between a write committing on the primary and appearing on a replica is replication lag. A user who writes a comment and immediately reloads might not see it if the read hits a lagging replica. This is a violation of read-your-writes consistency (also called read-after-write consistency).
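A toy in-memory model makes the stale read concrete. The Primary and Replica classes below are hypothetical stand-ins, not any database's API: the primary appends each change to a list acting as the replication stream, and the replica only sees a change once it has replayed that entry.

```python
class Primary:
    def __init__(self):
        self.log = []   # replication stream: ordered (key, value) changes
        self.data = {}

    def write(self, key, value):
        self.data[key] = value
        self.log.append((key, value))
        return len(self.log)  # position of this write in the stream


class Replica:
    def __init__(self, primary):
        self.primary = primary
        self.applied = 0  # number of log entries replayed so far
        self.data = {}

    def catch_up(self, limit=None):
        """Replay pending entries; `limit` caps how many are applied in this call."""
        end = len(self.primary.log)
        if limit is not None:
            end = min(end, self.applied + limit)
        for key, value in self.primary.log[self.applied:end]:
            self.data[key] = value
        self.applied = end

    def lag(self):
        # Lag measured in unapplied entries rather than in seconds.
        return len(self.primary.log) - self.applied


primary = Primary()
replica = Replica(primary)
primary.write("comment:1", "hello")
print(replica.data.get("comment:1"))  # None: the write has not been replayed yet
replica.catch_up()
print(replica.data.get("comment:1"))  # hello
```

The first read is the "reload right after posting" case: the write has committed on the primary, but the replica has not applied it yet.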

Common Fixes

Route a user to the primary for a short window after they write.
Track the replication position of the write and wait for the replica to reach it before reading.
Accept eventual consistency where staleness is harmless.
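The first two fixes can be sketched as follows. Everything here is illustrative: StickyRouter, wait_for_position, the 5-second window, and the get_replica_position callback are assumptions made for the example. A real system would get the position from its database, for example a log sequence number or transaction ID.

```python
import time


class StickyRouter:
    """Hypothetical: send a user's reads to the primary for `window` seconds after a write."""

    def __init__(self, window=5.0, clock=time.monotonic):
        self.window = window
        self.clock = clock        # injectable so the window can be tested without sleeping
        self._last_write = {}     # user_id -> time of that user's most recent write

    def record_write(self, user_id):
        self._last_write[user_id] = self.clock()

    def read_target(self, user_id):
        last = self._last_write.get(user_id)
        if last is not None and self.clock() - last < self.window:
            return "primary"      # recent writer: avoid a possibly stale replica
        return "replica"


def wait_for_position(get_replica_position, target, timeout=1.0, poll=0.01):
    """Poll until the replica has applied `target`; return False if the timeout expires."""
    deadline = time.monotonic() + timeout
    while get_replica_position() < target:
        if time.monotonic() >= deadline:
            return False          # caller can fall back to reading from the primary
        time.sleep(poll)
    return True
```

The sticky window is simple but sends extra reads to the primary. Waiting on a position keeps reads on replicas at the cost of added latency, which is why the sketch returns False on timeout so the caller can fall back to the primary.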

Key idea

Read replicas scale reads by copying the primary, but asynchronous replication lag means replicas can return stale data.

