What PBFT made practical
Practical Byzantine Fault Tolerance, by Castro and Liskov in 1999, was the first BFT protocol fast enough for real systems. It tolerates f faulty nodes out of 3f plus 1 while serving requests with reasonable latency.
The primary and replicas
One node acts as the primary and orders requests; the rest are backups. A request flows through three phases that build agreement and detect a bad primary.
The three phases
- Pre prepare the primary assigns a sequence number and broadcasts the request
- Prepare replicas broadcast that they saw the same assignment; collecting 2f matching messages proves agreement on order
- Commit replicas broadcast commit; collecting 2f plus 1 commits means the request is committed and can execute
The two voting rounds ensure that even with f liars, honest nodes agree on a single ordering.
View changes
If the primary is faulty or slow, replicas trigger a view change to elect a new primary, preserving safety across the switch.
Trade offs
- Strong safety with up to f Byzantine nodes
- Message cost grows quadratically with node count
- Best for small permissioned clusters, not open networks
Key idea
PBFT orders requests through pre prepare prepare and commit phases over 3f plus 1 nodes, using two voting rounds and view changes so honest replicas agree despite f liars.