Why replicate
A single broker holding a partition is a single point of failure. Replication keeps copies of each partition on several brokers. One copy is the leader that handles reads and writes; the others are followers that copy the leader.
In sync replicas
The set of followers that are caught up is the in sync replica set or ISR. A write is considered safe once enough ISR members have it. The acks setting controls this: acks all waits for the full ISR, giving the strongest durability.
The durability latency trade off
- acks all: highest durability, more latency since it waits for replicas.
- acks one: only the leader confirms, faster but loses data if the leader dies before followers copy it.
Replication factor
A factor of three tolerates one broker loss while still keeping a quorum. Higher factors cost more storage and network but survive more failures.
Flow
Key idea
Replication keeps partition copies on multiple brokers; the acks setting trades latency for durability based on how many in sync replicas must confirm.