The concurrency problem
In a leaderless system, two clients can write the same key at the same time on different nodes. The system must decide whether one write happened before the other, in which case the later one wins, or whether they are concurrent, in which case both must be kept as a conflict.
Wall clock timestamps are unreliable for this because clocks drift. Instead we track causality.
Version numbers and vectors
Each write carries a version. A single node assigns increasing version numbers, and a client must include the version it read when writing back. If the new write builds on the latest version, it supersedes the old value.
With multiple nodes, a single counter is not enough. A version vector holds one counter per node. Comparing two vectors reveals the relationship:
- If every entry of one is at least as large as the other, one dominates and is newer.
- If each has some entry larger than the other, they are concurrent and conflict.
Concurrent values, called siblings, are returned together so the application or a later write can merge them.
Key idea
Version vectors detect concurrency by tracking causality per node, so the system knows when to overwrite and when to preserve conflicting siblings.