← Lessons

quiz vs the machine

Silver1100

Databases

The LSM Tree Storage Engine

A log structured merge tree turns random writes into fast sequential appends by buffering in memory and merging sorted files later.

4 min read · intro · beat Silver to climb

What an LSM Tree Is

A log structured merge tree is the write optimized engine behind systems like RocksDB, Cassandra, and LevelDB. Instead of updating pages in place, it only ever appends.

  • New writes land in an in memory structure called a memtable.
  • When the memtable fills, it is flushed to disk as an immutable sorted file.
  • Background compaction merges these files to reclaim space and keep reads fast.

Why Appends Help

Random in place writes force slow disk seeks. By buffering writes in memory and flushing them as large sequential files, an LSM tree turns scattered updates into cheap sequential I O. This makes it excellent for write heavy workloads.

The Cost

A key can exist in several files at once, so a read may have to check the memtable plus multiple disk files. Engines add bloom filters and tiered levels to keep reads acceptable.

Key idea

An LSM tree buffers writes in memory and flushes immutable sorted files, trading extra read work for very fast sequential writes.

Check yourself

Answer to earn rating on the learn ladder.

1. Why are LSM tree writes fast?

2. What makes reads harder in an LSM tree?