Apache Kafka Fundamentals

Understand topics, partitions, offsets, and the append only log at the heart of Kafka.

The core model

Kafka is a distributed append only log. Producers append records to topics, and consumers read them. A topic is split into partitions, each an ordered immutable sequence of records.

Offset: every record in a partition has a monotonically increasing offset, its position in the log.
Retention: records stay for a configured time or size, so consumers can reread history.
Brokers: servers that host partitions, with each partition replicated across brokers for durability.

Why partitions matter

Parallelism: more partitions let more consumers read in parallel.
Ordering scope: order is guaranteed only within a partition, not across the whole topic.
Key routing: records with the same key go to the same partition, keeping per key order.

Durability

Each partition has a leader and followers. Writes go to the leader and replicate to followers.
A write is acknowledged once enough in sync replicas have it, controlled by the acks setting.

Consumers track their own position

Unlike a queue that deletes on read, Kafka keeps records and each consumer group stores its own offset, so different readers progress independently and can replay.

Key idea