Read Replica Routing

Splitting reads from writes

A common scaling pattern sends all writes to the primary and spreads read-only queries across read replicas. A proxy or application-level router directs the traffic, often by inspecting whether a statement is a SELECT or a write.
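
A minimal sketch of statement-based routing, in Python. The target names (primary, replica-1, replica-2) and the functions is_read_only and route are hypothetical, chosen only for illustration; a real proxy would parse SQL properly rather than match keywords. The sketch is deliberately conservative: anything that is not a plain SELECT, including locking reads and statements inside an open transaction, goes to the primary.

```python
import itertools
import re

# Hypothetical target names; a real router would hold connection pools instead.
PRIMARY = "primary"
REPLICAS = ["replica-1", "replica-2"]

_replica_cycle = itertools.cycle(REPLICAS)

# Matches leading whitespace, /* ... */ comments and -- line comments.
_LEADING = re.compile(r"^\s*(?:/\*.*?\*/\s*|--[^\n]*\n\s*)*", re.DOTALL)


def is_read_only(sql: str) -> bool:
    """Return True only for plain SELECT statements that take no locks."""
    body = _LEADING.sub("", sql, count=1).upper()
    if not re.match(r"SELECT\b", body):
        # Writes, DDL, and anything unrecognised (for example WITH ... queries)
        # are treated as writes, which is the safe default.
        return False
    # SELECT ... FOR UPDATE / FOR SHARE takes locks and belongs on the primary.
    return re.search(r"\bFOR\s+(UPDATE|SHARE)\b", body) is None


def route(sql: str, in_transaction: bool = False) -> str:
    """Pick the primary for writes and open transactions, else a replica (round-robin)."""
    if in_transaction or not is_read_only(sql):
        return PRIMARY
    return next(_replica_cycle)
```

Keyword matching fails safe in this sketch: a misclassified read costs some primary capacity, whereas a write sent to a replica would be an error.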

The lag problem

Replicas apply changes only after the primary commits them, so they trail the primary by some replication lag. A user who writes and then immediately reads from a lagging replica may not see their own change, which breaks read-your-writes consistency.
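
The stale read can be reproduced with a toy model. The Primary and Replica classes below are hypothetical stand-ins, not a real database API: the primary appends each change to an ordered log, and the replica applies log entries one at a time, so any entry it has not applied yet is its lag.

```python
class Primary:
    def __init__(self):
        self.log = []    # ordered list of (key, value) changes
        self.data = {}

    def write(self, key, value):
        self.data[key] = value
        self.log.append((key, value))
        return len(self.log)    # position of this write in the log


class Replica:
    def __init__(self, primary):
        self.primary = primary
        self.applied = 0    # number of log entries applied so far
        self.data = {}

    def apply_next(self):
        """Apply one pending change, if any (stands in for the replication thread)."""
        if self.applied < len(self.primary.log):
            key, value = self.primary.log[self.applied]
            self.data[key] = value
            self.applied += 1

    def lag(self):
        """Lag measured in unapplied log entries."""
        return len(self.primary.log) - self.applied


primary = Primary()
replica = Replica(primary)

primary.write("name", "old")
replica.apply_next()

primary.write("name", "new")    # committed on the primary...
stale = replica.data["name"]    # ...but the replica still holds the previous value
lag_before = replica.lag()

replica.apply_next()            # the replica catches up
fresh = replica.data["name"]
```

Between the second write and the second apply_next call, a read served by the replica returns the previous value even though the write has already committed.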

Mitigations

Route reads that immediately follow a write to the primary for a short window, so the user sees their own change.
Track the GTID or binlog position the write reached, and serve the read only from a replica that has applied at least that far (sometimes called waiting for position).
Deliberately send reads that cannot tolerate stale data, or that need strong consistency, to the primary.
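
The three mitigations can be sketched as routing decisions made per session. Everything named here is an assumption for illustration: the Session and ReplicaState types, the integer positions (standing in for a GTID or binlog coordinate), and the 2-second pin window are not taken from any particular proxy.

```python
import time
from dataclasses import dataclass

PIN_WINDOW_SECONDS = 2.0    # assumed value; tune it to the lag actually observed


@dataclass
class ReplicaState:
    name: str
    applied_position: int    # highest log position this replica has applied


@dataclass
class Session:
    last_write_position: int = 0             # position the latest write reached
    last_write_time: float = float("-inf")   # monotonic time of the latest write


def record_write(session, position, now=None):
    """Remember where and when this session last wrote."""
    session.last_write_position = position
    session.last_write_time = time.monotonic() if now is None else now


def choose_by_window(session, replica_name, now=None):
    """Time-based pinning: use the primary for a short window after a write."""
    now = time.monotonic() if now is None else now
    if now - session.last_write_time < PIN_WINDOW_SECONDS:
        return "primary"
    return replica_name


def choose_by_position(session, replicas, strong=False):
    """Position-based routing: use a replica only if it has applied the write."""
    if strong:    # the caller explicitly asked for the primary
        return "primary"
    for replica in replicas:
        if replica.applied_position >= session.last_write_position:
            return replica.name
    return "primary"    # no replica has caught up yet
```

The window is simpler but only guesses at the lag, so it can still serve a stale read when a replica falls further behind than the window. Position tracking is exact, at the cost of capturing a position on every write and knowing each replica's applied position.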

Operational care

Monitor Seconds_Behind_Master (Seconds_Behind_Source in newer MySQL releases) and remove badly lagging replicas from the read pool.
Beware that adding replicas does not help write-heavy workloads, since every replica still has to apply every write.
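
Both points can be sketched in a few lines. The 5-second threshold, the function names, and the capacity model are assumptions for illustration. The model is deliberately crude: it treats each replica as having a fixed budget of operations per second and charges every write against every replica.

```python
MAX_LAG_SECONDS = 5    # assumed threshold; choose one that fits the workload


def healthy_replicas(lag_by_replica, max_lag=MAX_LAG_SECONDS):
    """Return the names of replicas that may stay in the read pool.

    A lag of None models a replica whose replication is stopped or broken
    (MySQL reports NULL for the lag in that case), so it is excluded too.
    """
    return sorted(
        name
        for name, lag in lag_by_replica.items()
        if lag is not None and lag <= max_lag
    )


def read_capacity(replica_count, capacity_per_replica, write_rate):
    """Simplified model: read throughput left after each replica applies all writes."""
    per_replica = max(capacity_per_replica - write_rate, 0)
    return replica_count * per_replica
```

Under this model, 4 replicas that can each handle 1000 operations per second offer 3600 reads per second at 100 writes per second, but only 800 at 800 writes per second, and adding more replicas does not change the per-replica share consumed by writes.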

Key idea

Read replica routing scales reads by sending writes to the primary and reads to replicas, but replication lag means consistency-sensitive reads must either wait for a replica to reach a known position or go to the primary.
