← Lessons

quiz vs the machine

Gold1390

Databases

The Page Compression

Compressing pages shrinks storage and disk traffic at the cost of CPU, and the chosen block size shapes the balance.

4 min read · core · beat Gold to climb

Trading CPU for Input Output

Storage engines can store pages compressed on disk. A page is compressed before it is written and decompressed after it is read. This shrinks the file and cuts how many bytes the disk must move, paying with extra CPU work.

Block Versus Page

Engines compress a unit of data, sometimes a single page and sometimes a larger block of several pages.

  • A larger compression block usually packs better because the algorithm sees more redundancy.
  • But reading one row may force decompressing a whole large block, wasting work.
  • A smaller block decompresses less per read but compresses less tightly.

Why It Helps LSM and B Trees

In an LSM tree, immutable SSTables are a natural fit because they are written once and read many times, so paying to compress once is cheap. In a B tree, compression must cope with in place updates, which is harder because a page may grow when rewritten.

Tuning

Choosing a faster algorithm favors CPU bound systems, while a stronger algorithm favors storage bound ones. The right choice depends on whether the bottleneck is the processor or the disk.

Key idea

Page compression trades CPU for smaller files and less disk traffic, with block size and algorithm choice balancing read overhead against compression ratio.

Check yourself

Answer to earn rating on the learn ladder.

1. What does page compression trade away to save storage?

2. Why can a larger compression block hurt a small read?