Two retention models
Time-based retention keeps every message for a fixed window, then deletes messages by age. Log compaction is different: it keeps the latest message for each key and removes older values for the same key. Once compaction has caught up, the log is a snapshot of the most recent state per key.
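The contrast between the two models can be sketched in a few lines of Python. This is a toy in-memory model, not any broker's implementation: the `Record` type, its fields, and the function names `retain_by_time` and `compact` are all assumptions made for illustration.

```python
from typing import NamedTuple, Optional


class Record(NamedTuple):
    # Hypothetical record shape, for illustration only.
    offset: int
    timestamp: float  # seconds
    key: str
    value: Optional[str]


def retain_by_time(log, now, window_seconds):
    """Time-based retention: drop every record older than the window."""
    return [r for r in log if now - r.timestamp <= window_seconds]


def compact(log):
    """Log compaction: keep only the highest-offset record for each key."""
    latest = {}
    for r in log:
        latest[r.key] = r  # a later record for the same key replaces the earlier one
    return sorted(latest.values(), key=lambda r: r.offset)


log = [
    Record(0, 100.0, "user-1", "alice@old.example"),
    Record(1, 200.0, "user-2", "bob@example"),
    Record(2, 300.0, "user-1", "alice@new.example"),
]

by_time = retain_by_time(log, now=350.0, window_seconds=100.0)
compacted = compact(log)
```

With these inputs, time-based retention drops the only record for `user-2` because it is too old, while compaction keeps it and drops only the superseded value for `user-1`.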
Why it is useful
Compaction suits changelog-style data where only the current value matters, such as the latest profile for each user ID. A new consumer can rebuild the full current state by replaying the compacted log, and the log does not need to store every historical update.
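As a rough sketch of that replay, a consumer can fold the log into a dictionary, letting later records overwrite earlier ones. The function name `rebuild_state` and the `(key, value)` tuple format are assumptions for this example, not part of any particular client library.

```python
def rebuild_state(log):
    """Replay (key, value) records in order; the last value per key wins."""
    state = {}
    for key, value in log:
        state[key] = value
    return state


# Hypothetical profile updates, in the order they were written.
full_log = [
    ("user-1", {"name": "Ann"}),
    ("user-2", {"name": "Bo"}),
    ("user-1", {"name": "Ann B."}),
]

# The same log after compaction: one record per key.
compacted_log = [
    ("user-2", {"name": "Bo"}),
    ("user-1", {"name": "Ann B."}),
]

state_from_full = rebuild_state(full_log)
state_from_compacted = rebuild_state(compacted_log)
```

Both replays end in the same state, which is why the compacted log is enough for a consumer that only needs current values.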
How tombstones delete
To remove a key entirely, a producer writes a tombstone: a message with that key and a null value. Compaction keeps the tombstone for a retention period so that consumers have a chance to see the deletion, then purges it.
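The tombstone lifecycle might be modeled as follows. This is a simplified sketch: the function name `compact_with_tombstones`, the `tombstone_retention_seconds` parameter, and the `(timestamp, key, value)` record format are assumptions, with `None` standing in for the null value.

```python
def compact_with_tombstones(log, now, tombstone_retention_seconds):
    """Compact a log of (timestamp, key, value) records given in write order.

    A value of None is a tombstone. The latest record per key is kept,
    except that a tombstone older than the retention period is purged.
    """
    latest = {}
    for position, (timestamp, key, value) in enumerate(log):
        latest[key] = (position, timestamp, value)

    kept = []
    for key, (position, timestamp, value) in latest.items():
        expired = value is None and now - timestamp > tombstone_retention_seconds
        if not expired:
            kept.append((position, timestamp, key, value))

    kept.sort()  # positions are unique, so this restores write order
    return [(timestamp, key, value) for _, timestamp, key, value in kept]


log = [
    (10.0, "user-1", "profile-v1"),
    (20.0, "user-2", "profile-v1"),
    (30.0, "user-1", None),  # tombstone: delete user-1
]

soon_after = compact_with_tombstones(log, now=40.0, tombstone_retention_seconds=60.0)
much_later = compact_with_tombstones(log, now=100.0, tombstone_retention_seconds=60.0)
```

In the first pass the tombstone for `user-1` is still in the log, so a consumer reading it learns about the deletion. In the later pass the tombstone has aged out and the key is gone entirely.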
The trade-off
Compaction loses history: you can no longer see the full sequence of changes, only the final value per key. It also runs as a background process, so recently superseded values for a key may still be present until the next compaction pass, and consumers should not assume exactly one message per key.
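To illustrate the lag, the sketch below compacts only the older part of a log and leaves the newest records untouched, as if the background pass had not reached them yet. This is a deliberately simplified model: `compact_prefix` and the `cleaned_up_to` parameter are hypothetical, and real systems decide what to clean in their own ways.

```python
def compact_prefix(log, cleaned_up_to):
    """Compact log[:cleaned_up_to]; leave the remaining records as written."""
    head, tail = log[:cleaned_up_to], log[cleaned_up_to:]

    latest = {}
    for position, (key, value) in enumerate(head):
        latest[key] = (position, value)

    # Restore write order among the surviving head records.
    survivors = sorted(latest.items(), key=lambda item: item[1][0])
    compacted_head = [(key, value) for key, (_, value) in survivors]
    return compacted_head + tail


log = [("k", "1"), ("k", "2"), ("j", "a"), ("k", "3"), ("k", "4")]
partly_compacted = compact_prefix(log, cleaned_up_to=3)

# A consumer that applies records in order still ends with the right state.
state = {}
for key, value in partly_compacted:
    state[key] = value
```

Here the key `k` still appears three times after the pass, yet a consumer that applies records in order and lets the last value win arrives at the correct current state.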
Key idea
Compaction retains only the latest value per key, turning a log into a current-state snapshot at the cost of losing the historical sequence of changes.