Two Ways to Store Data
Most databases pick one of two on disk structures. A B plus tree keeps data sorted in fixed pages and updates rows in place. An LSM tree (log structured merge tree) buffers writes in memory and flushes them as sorted files that are merged later.
How They Differ
- A B plus tree does a random write for each update, but reads follow a short path from root to leaf.
- An LSM tree turns writes into fast sequential appends, but a read may check several files plus the memory buffer.
- B plus trees suit read heavy workloads and predictable latency.
- LSM trees suit write heavy workloads and compress well.
The Tradeoff
There is no free lunch. LSM trees give high write throughput but pay with read amplification and background compaction work. B plus trees give clean reads but suffer write amplification from rewriting whole pages.
Key idea
B plus trees favor fast in place reads while LSM trees favor fast sequential writes, so the right engine depends on whether your workload reads or writes more.