Databases

Write Amplification

Write amplification measures how many bytes the storage engine actually writes for each byte the application asked it to store.

What Write Amplification Is

Write amplification is the ratio of physical bytes written to disk to logical bytes written by the application. A ratio of ten means that storing one megabyte caused ten megabytes of actual disk writes.
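As a minimal sketch of that definition, the ratio can be computed from two byte counters. The function name `write_amplification` and the sample numbers are illustrative assumptions, not part of any particular storage engine's API.

```python
def write_amplification(physical_bytes: int, logical_bytes: int) -> float:
    """Return physical bytes written per logical byte written.

    Both arguments are hypothetical counters: in practice they would come
    from device statistics and from the application's own accounting.
    """
    if logical_bytes <= 0:
        raise ValueError("logical_bytes must be positive")
    return physical_bytes / logical_bytes


MB = 1024 * 1024

# The example from the text: storing 1 MB caused 10 MB of disk writes.
print(write_amplification(10 * MB, 1 * MB))  # 10.0
```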

Where It Comes From

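  • LSM trees rewrite data every time compaction merges it into the next level, so a row can be rewritten many times over its life.
  • B-trees write a full page even for a small row change, and the write-ahead log means each change is written at least twice: once to the log and once to the page.
  • Flash storage adds its own amplification because erases happen in large blocks, so updating a small part of a block can force the drive to relocate the block's remaining valid data.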
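To get a feel for the B-tree and flash sources, here is a rough back-of-envelope sketch. It assumes a single row change forces one full page write plus one log record, and that the engine's amplification and the drive's amplification multiply. The 8192-byte page size, the 128-byte row, and the flash factor of 2.0 are made-up inputs for illustration. Real engines batch several changes into one page flush, so actual numbers are often lower.

```python
def btree_update_wa(row_bytes: int, page_bytes: int = 8192,
                    wal_overhead_bytes: int = 0) -> float:
    """Estimate write amplification for one small row change in a B-tree.

    Simplifying assumptions (not measured behavior):
      - the change dirties exactly one page, which is written in full
      - the write-ahead log records the row bytes plus a fixed overhead
    """
    if row_bytes <= 0:
        raise ValueError("row_bytes must be positive")
    wal_bytes = row_bytes + wal_overhead_bytes
    return (page_bytes + wal_bytes) / row_bytes


def end_to_end_wa(engine_wa: float, flash_wa: float) -> float:
    """Combine engine-level and device-level amplification by multiplying."""
    return engine_wa * flash_wa


engine = btree_update_wa(row_bytes=128)   # (8192 + 128) / 128
print(engine)                             # 65.0
print(end_to_end_wa(engine, 2.0))         # 130.0, with a hypothetical flash factor
```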

Why It Matters

High write amplification burns disk bandwidth and wears out solid-state drives faster, since flash cells tolerate a limited number of erase cycles. On write-heavy systems, the storage layer's write volume can become the real bottleneck, rather than the application itself.

The Tradeoff

  • Leveled compaction lowers read and space amplification but raises write amplification.
  • Size-tiered compaction lowers write amplification but raises space and read amplification.

There is no free lunch. Tuning a storage engine is largely about choosing which amplification to pay.
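The gap between the two strategies can be sketched with a common rule-of-thumb model. It assumes that under leveled compaction a byte may be rewritten about `fanout` times at each level, while under size-tiered compaction it is rewritten roughly once per tier. These are coarse approximations for comparison only, and the function names and the inputs (5 levels, fanout of 10) are hypothetical.

```python
def leveled_write_amp(levels: int, fanout: int) -> int:
    """Rough estimate for leveled compaction.

    Assumes each byte is rewritten up to `fanout` times per level, because
    merging into a level rewrites the overlapping data already there.
    """
    return levels * fanout


def size_tiered_write_amp(tiers: int) -> int:
    """Rough estimate for size-tiered compaction.

    Assumes each byte is rewritten about once per tier, when similarly
    sized tables are merged into one larger table.
    """
    return tiers


# Hypothetical configuration: 5 levels (or tiers), fanout of 10.
print(leveled_write_amp(levels=5, fanout=10))  # 50
print(size_tiered_write_amp(tiers=5))          # 5
```

Under this model the leveled configuration writes about ten times more than the size-tiered one, which is the price paid for its lower read and space amplification.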

Key idea

Write amplification is the ratio of physical bytes written to logical bytes written. It is driven by compaction, logging, and flash, and it trades off against read and space cost.

Check yourself

1. What does write amplification measure?

2. Why is high write amplification a problem on solid state drives?

3. Which compaction strategy tends to raise write amplification?