What a barrier guarantees
A barrier is a synchronization point: no participant proceeds past it until all participants have reached it. In a single process this is a counter and a condition variable. Across a network it needs a coordinator and a place to record arrivals that survives crashes.
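The single-process case can be sketched in a few lines of Python. This is a minimal one-shot illustration, not a production primitive: the names `SimpleBarrier` and `run_demo` are made up for this example, and for real use Python's standard library already provides `threading.Barrier`.

```python
import threading


class SimpleBarrier:
    """One-shot barrier: a counter guarded by a condition variable."""

    def __init__(self, parties: int) -> None:
        self._parties = parties
        self._arrived = 0
        self._cond = threading.Condition()

    def wait(self) -> None:
        with self._cond:
            self._arrived += 1
            if self._arrived == self._parties:
                # Last arrival: wake everyone who is blocked below.
                self._cond.notify_all()
            else:
                # wait_for re-checks the predicate, so spurious wakeups are harmless.
                self._cond.wait_for(lambda: self._arrived >= self._parties)


def run_demo(parties: int = 4) -> list[str]:
    """Start `parties` threads and record what each does before and after the barrier."""
    barrier = SimpleBarrier(parties)
    events: list[str] = []
    events_lock = threading.Lock()

    def worker() -> None:
        with events_lock:
            events.append("before")
        barrier.wait()
        with events_lock:
            events.append("after")

    threads = [threading.Thread(target=worker) for _ in range(parties)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return events
```

Calling `run_demo(4)` returns a list in which every "before" entry precedes every "after" entry, because no thread can pass `wait()` until the counter has reached the number of parties.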
A ZooKeeper-style barrier
With a coordination store such as ZooKeeper, a common pattern is:
- Each node creates a child under a barrier parent to announce it has arrived.
- A node then checks whether the number of children equals the expected total.
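- If not, it sets a watch on the barrier node's children and blocks. The watch fires on any change to the children, not only when the total is reached, so a woken node counts again and sets a new watch if it is still short. Once the count reaches the total, everyone proceeds.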
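The steps above can be sketched against an in-memory stand-in for the coordination store. Everything here is an assumption made for illustration: `FakeStore`, `create_child`, `get_children`, and `enter_barrier` are hypothetical names, not a real client API, and the sketch assumes watches are one-shot and fire on any child change. A real deployment would also have to deal with sessions, crashed participants, and connection errors, which are left out.

```python
import threading
from typing import Callable, Optional


class FakeStore:
    """Hypothetical in-memory stand-in for a coordination store (not a real client API)."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._children: dict[str, set[str]] = {}
        self._watches: dict[str, list[Callable[[], None]]] = {}

    def create_child(self, parent: str, name: str) -> None:
        """Add a child, then fire and discard every watch set on the parent."""
        with self._lock:
            self._children.setdefault(parent, set()).add(name)
            fired = self._watches.pop(parent, [])
        for callback in fired:
            callback()

    def get_children(
        self, parent: str, watch: Optional[Callable[[], None]] = None
    ) -> list[str]:
        """Return the children and, atomically, register an optional one-shot watch."""
        with self._lock:
            if watch is not None:
                self._watches.setdefault(parent, []).append(watch)
            return sorted(self._children.get(parent, set()))


def enter_barrier(store: FakeStore, parent: str, name: str, total: int) -> None:
    """Announce arrival, then block until `total` children exist under `parent`."""
    store.create_child(parent, name)
    while True:
        changed = threading.Event()
        # Reading the children and setting the watch happen in one step,
        # so an arrival between the two cannot be missed.
        children = store.get_children(parent, watch=changed.set)
        if len(children) >= total:
            return
        changed.wait()  # woken by any child change; loop to count again
```

Each participant would call something like `enter_barrier(store, "/barrier", "node-1", 3)` with its own name. The loop re-counts after every wakeup because a watch firing only says that the children changed, not that the total has been reached.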
The double barrier
Many workloads need to synchronize at both the start and the end of a phase. A double barrier adds a leave protocol: each node removes its child when it finishes, then waits until the count drops to zero before exiting. This ensures no node races ahead into the next phase while others are still finishing.
- Entry barrier: wait until all have arrived.
- Exit barrier: wait until all have left.
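A minimal sketch of the enter and leave steps follows, again over a hypothetical in-memory store. `FakeStore`, `wait_until`, and `DoubleBarrier` are illustrative names, and `wait_until` stands in for the watch-and-recheck loop a real client would run. One addition goes beyond the description above and is an assumption of this sketch: a `ready` marker, written by the first node to observe the full count. Without it, a slow node could count its peers only after a fast node had already left, see fewer than the total, and block at the entry barrier forever. The barrier is one-shot; reuse would need the marker to be cleared.

```python
import threading
from typing import Callable


class FakeStore:
    """Hypothetical in-memory stand-in for a coordination store (not a real client API)."""

    def __init__(self) -> None:
        self._cond = threading.Condition()
        self._paths: set[str] = set()

    def create(self, path: str) -> None:
        with self._cond:
            self._paths.add(path)  # idempotent in this sketch
            self._cond.notify_all()

    def delete(self, path: str) -> None:
        with self._cond:
            self._paths.discard(path)
            self._cond.notify_all()

    def wait_until(self, predicate: Callable[[set[str]], bool]) -> None:
        """Block until predicate(paths) is true; re-evaluated after every change."""
        with self._cond:
            self._cond.wait_for(lambda: predicate(self._paths))


class DoubleBarrier:
    """One-shot double barrier for `total` participants under `root`."""

    def __init__(self, store: FakeStore, root: str, name: str, total: int) -> None:
        self._store = store
        self._member = f"{root}/members/{name}"
        self._prefix = f"{root}/members/"
        self._ready = f"{root}/ready"  # assumed marker, see lead-in
        self._total = total

    def _count(self, paths: set[str]) -> int:
        return sum(1 for p in paths if p.startswith(self._prefix))

    def enter(self) -> None:
        """Entry barrier: announce arrival, wait until all have arrived."""
        self._store.create(self._member)
        self._store.wait_until(
            lambda paths: self._ready in paths or self._count(paths) >= self._total
        )
        # Record that the full count was reached before anyone can leave.
        self._store.create(self._ready)

    def leave(self) -> None:
        """Exit barrier: remove own child, wait until all have left."""
        self._store.delete(self._member)
        self._store.wait_until(lambda paths: self._count(paths) == 0)
```

A participant would call `enter()`, do its work for the phase, then call `leave()`. Because every node writes the `ready` marker before it can delete its own child, a late checker always sees either the full count or the marker.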
Key idea
A distributed barrier blocks every participant until all arrive, and a double barrier adds a symmetric leave step so phases do not overlap.