← Lessons

quiz vs the machine

Gold1400

System Design

The Consensus Problem

Getting independent nodes to agree on a single value.

5 min read · core · beat Gold to climb

What consensus means

Consensus is the problem of getting a group of nodes to agree on one value even when some nodes fail and messages are delayed. It underlies leader election, distributed locks, and replicated logs.

The properties

A correct consensus protocol must satisfy:

  • Agreement no two nodes decide different values
  • Validity the decided value was actually proposed by someone
  • Termination every non faulty node eventually decides
  • Integrity a node decides at most once

Why it is hard

Nodes have no shared clock and cannot tell a crashed peer from a slow one. A naive vote can stall forever if proposers keep colliding. Real protocols add structure such as rounds, leaders, and quorums to make progress likely.

Where it shows up

  • Replicated state machines apply the same commands in the same order
  • Configuration stores like etcd and ZooKeeper agree on cluster metadata
  • Distributed locks need agreement on who holds the lock

The catch

You will see in the FLP result that termination cannot be guaranteed in a purely asynchronous network with even one failure, which is why practical systems lean on timeouts.

Key idea

Consensus is agreement on one value despite failures, defined by agreement validity termination and integrity, and it powers replicated logs locks and configuration.

Check yourself

Answer to earn rating on the learn ladder.

1. Which property says no two nodes decide different values?

2. Consensus most directly powers which mechanism?