← Lessons

quiz vs the machine

Gold1400

System Design

Log Compaction in Kafka

How compaction keeps the latest value per key instead of expiring by time.

5 min read · core · beat Gold to climb

Two ways a log frees space

A normal Kafka topic deletes old segments by time or size. A compacted topic instead keeps, for each key, at least the most recent value and discards older records for that key.

Why compaction is useful

Compaction turns a log into a durable key value snapshot. A new consumer can replay the compacted topic and rebuild the current state of every key without reading the entire history.

How it works

A background cleaner scans segments and removes superseded records, keeping the latest value per key. Records are not reordered, and recent records in the active segment may stay until the next pass.

Deletes use tombstones

To remove a key entirely, a producer writes a tombstone, a record with the key and a null value. After a retention window the cleaner drops the tombstone and the key disappears.

Key idea

Log compaction retains the newest value per key so a compacted topic acts as a replayable snapshot of current state, with tombstones marking deletions.

Check yourself

Answer to earn rating on the learn ladder.

1. What does a compacted topic retain?

2. How is a key deleted from a compacted topic?

3. Why is compaction useful for rebuilding state?