Changing Shape While Live
Adding a column or index to a distributed table is risky: different nodes learn about the change at different moments. If one node enforces a new constraint while another does not, data can become inconsistent. The solution is a staged rollout through intermediate schema states.
The Intermediate States
Inspired by the Google F1 schema change protocol, a change moves through states like delete only and write only before becoming public.
- Delete only: the new index can be deleted from but not written or read.
- Write only: writes maintain the index but reads ignore it.
- A backfill then populates existing rows.
- Public: the index is fully usable.
Two adjacent states are always compatible, so nodes lagging by one state can never corrupt data.
Why It Is Safe
By guaranteeing that the cluster only ever spans two adjacent compatible states at once, the protocol lets the change propagate node by node without a global stop. No long table lock is needed, so the database stays online.
Key idea
Online schema change rolls through compatible intermediate states like delete only and write only, so nodes lagging by one state never corrupt data and the table stays available.