← Lessons

quiz vs the machine

Platinum1820

System Design

The Log Structured Merge Tree Revisited

Buffer writes in memory, flush sorted runs to disk, and merge them in the background.

6 min read · advanced · beat Platinum to climb

Designed for Write Heavy Workloads

A B tree updates data in place, which means random disk writes. A log structured merge tree, or LSM tree, instead turns writes into sequential appends and defers the sorting work, trading some read cost for very high write throughput.

How It Works

  • New writes land in an in memory memtable, usually a sorted structure, and in a write ahead log for durability.
  • When the memtable fills, it is flushed to disk as an immutable sorted file called an SSTable.
  • Over time many SSTables accumulate, so a background compaction merges them, discarding overwritten and deleted keys.

The Tradeoffs

A read may have to check the memtable plus several SSTables, so each level keeps a bloom filter to skip files that cannot hold the key. The cost is write amplification: data is rewritten each time it is compacted. Tuning compaction balances read speed, write amplification, and space used.

Key idea

An LSM tree buffers writes in a memtable, flushes immutable sorted SSTables, and compacts them in the background, trading write amplification and multi file reads for high write throughput.

Check yourself

Answer to earn rating on the learn ladder.

1. Where do new writes first land in an LSM tree?

2. What does background compaction accomplish?

3. Why does each SSTable level keep a bloom filter?