What an LSM Tree Is
A log structured merge tree is the write optimized engine behind systems like RocksDB, Cassandra, and LevelDB. Instead of updating pages in place, it only ever appends.
- New writes land in an in memory structure called a memtable.
- When the memtable fills, it is flushed to disk as an immutable sorted file.
- Background compaction merges these files to reclaim space and keep reads fast.
Why Appends Help
Random in place writes force slow disk seeks. By buffering writes in memory and flushing them as large sequential files, an LSM tree turns scattered updates into cheap sequential I O. This makes it excellent for write heavy workloads.
The Cost
A key can exist in several files at once, so a read may have to check the memtable plus multiple disk files. Engines add bloom filters and tiered levels to keep reads acceptable.
Key idea
An LSM tree buffers writes in memory and flushes immutable sorted files, trading extra read work for very fast sequential writes.