Cassandra Compaction Strategies

How SSTables are merged and which strategy fits each workload.

Why compaction exists

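Cassandra writes are append-only: data is buffered in a memtable and flushed to disk as immutable SSTables. Over time a partition's data spreads across many SSTables, and deletes leave tombstones rather than removing data in place. Compaction merges SSTables, keeps the newest version of each row, discards overwritten data, and purges tombstones once they are older than gc_grace_seconds.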
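The merge step can be sketched in a few lines of Python. This is a simplified model, not Cassandra's implementation: the Cell type, the dict-based SSTable, and the compact function are hypothetical names, timestamps are plain integers, and the sketch ignores the real-world rule that a tombstone may only be purged when no SSTable outside the compaction could still hold data it shadows.

```python
from typing import Dict, List, NamedTuple, Optional


class Cell(NamedTuple):
    value: Optional[str]  # None marks a tombstone (a delete)
    write_ts: int         # write timestamp, in seconds for this sketch


# Hypothetical model: an SSTable is a plain dict from key to Cell.
SSTable = Dict[str, Cell]


def compact(sstables: List[SSTable], now: int, gc_grace: int) -> SSTable:
    """Merge SSTables, keep the newest cell per key, purge old tombstones."""
    merged: SSTable = {}
    for table in sstables:
        for key, cell in table.items():
            current = merged.get(key)
            # Last write wins: the cell with the highest timestamp survives.
            if current is None or cell.write_ts > current.write_ts:
                merged[key] = cell
    # Keep live cells, and keep tombstones that are still inside the grace
    # period so the delete can reach replicas that missed it.
    return {
        key: cell
        for key, cell in merged.items()
        if cell.value is not None or now - cell.write_ts < gc_grace
    }


old = {"a": Cell("1", 10), "b": Cell("x", 10)}
new = {"a": Cell("2", 20), "b": Cell(None, 30), "c": Cell(None, 95)}
result = compact([old, new], now=100, gc_grace=50)
```

Here key "a" keeps its newer value, key "b" disappears because its tombstone has outlived the grace period, and key "c" keeps its tombstone because it is still too recent to purge.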

Size tiered compaction

SizeTieredCompactionStrategy (STCS) groups SSTables of similar size and merges a group once it holds enough of them (min_threshold, 4 by default).

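  • Suits write-heavy workloads because its write amplification is low.
  • The downside is read amplification, since a partition's data may be spread across SSTables in many size tiers. It also needs substantial free disk space, because a large merge writes its output before the inputs are deleted.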
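A rough sketch of the grouping step, assuming the STCS options bucket_low = 0.5, bucket_high = 1.5, and min_threshold = 4. The function names are hypothetical, sizes are in MB, and the sketch omits details such as min_sstable_size and the cap on how many SSTables one compaction takes.

```python
from typing import List


def size_tiered_buckets(
    sizes_mb: List[float], bucket_low: float = 0.5, bucket_high: float = 1.5
) -> List[List[float]]:
    """Group SSTable sizes into buckets of similar size."""
    buckets: List[List[float]] = []
    for size in sorted(sizes_mb):
        for bucket in buckets:
            avg = sum(bucket) / len(bucket)
            # An SSTable joins a bucket if it is within the allowed
            # range around that bucket's current average size.
            if bucket_low * avg <= size <= bucket_high * avg:
                bucket.append(size)
                break
        else:
            buckets.append([size])
    return buckets


def compaction_candidates(
    sizes_mb: List[float], min_threshold: int = 4
) -> List[List[float]]:
    """Return only the buckets that hold enough SSTables to be merged."""
    return [b for b in size_tiered_buckets(sizes_mb) if len(b) >= min_threshold]


sizes = [10, 11, 12, 9, 100, 105, 400]
buckets = size_tiered_buckets(sizes)
candidates = compaction_candidates(sizes)
```

With these sizes the four small SSTables form one bucket that is ready to merge, while the two mid-sized files and the single large file wait for more peers of similar size. That waiting is what keeps write amplification low and what lets a partition linger in several tiers.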

Leveled compaction

LeveledCompactionStrategy (LCS) organizes SSTables into levels, where each level holds roughly ten times as much data as the previous one.

  • Above level 0, SSTables within a level do not overlap, so a partition appears in at most one SSTable per level and reads touch fewer files.
  • Best for read-heavy and update-heavy workloads, at the cost of higher write amplification.
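The tenfold growth per level means few levels are needed even for large tables, which is why reads stay cheap. The arithmetic below is a sketch under assumptions: a 160 MB target SSTable size, a fanout of 10, and level N holding about sstable_size × fanout^N. The function names are hypothetical.

```python
def level_capacity_mb(level: int, sstable_size_mb: int = 160, fanout: int = 10) -> int:
    """Approximate target capacity of a level (level >= 1), in MB."""
    return sstable_size_mb * fanout ** level


def levels_needed(data_mb: int, sstable_size_mb: int = 160, fanout: int = 10) -> int:
    """Smallest number of levels whose combined capacity holds data_mb."""
    level, total = 0, 0
    while total < data_mb:
        level += 1
        total += level_capacity_mb(level, sstable_size_mb, fanout)
    return level


l1 = level_capacity_mb(1)          # 1600 MB
l2 = level_capacity_mb(2)          # 16000 MB
small = levels_needed(1_000)       # fits in level 1
large = levels_needed(500_000)     # about 500 GB of data
```

Under these assumptions about 500 GB fits in four levels, so a point read consults at most four SSTables above level 0, plus whatever is waiting in level 0. The price is that data is rewritten each time it is promoted to the next level.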

Time window compaction

TimeWindowCompactionStrategy (TWCS) buckets SSTables by time window and compacts only within a window.

  • Ideal for time-series data written with a TTL: once everything in an old window has expired, its SSTables can be dropped whole instead of being rewritten.
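The windowing idea can be sketched as follows. This is a simplified model with hypothetical names: each SSTable is a (name, newest write timestamp) pair, timestamps and the TTL are in seconds, every row is assumed to share the same TTL, and the grace period and overlap checks that a real cluster applies before dropping an SSTable are ignored.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical model: (SSTable name, newest write timestamp in seconds).
SSTableInfo = Tuple[str, int]


def window_start(ts: int, window_seconds: int) -> int:
    """Round a timestamp down to the start of its time window."""
    return ts - ts % window_seconds


def bucket_by_window(
    sstables: List[SSTableInfo], window_seconds: int
) -> Dict[int, List[str]]:
    """Group SSTables by the window containing their newest write."""
    buckets: Dict[int, List[str]] = defaultdict(list)
    for name, max_ts in sstables:
        buckets[window_start(max_ts, window_seconds)].append(name)
    return dict(buckets)


def droppable(sstables: List[SSTableInfo], ttl_seconds: int, now: int) -> List[str]:
    """SSTables whose newest row has already expired can be deleted whole."""
    return [name for name, max_ts in sstables if max_ts + ttl_seconds <= now]


tables = [("s1", 100), ("s2", 3500), ("s3", 3700), ("s4", 7300)]
windows = bucket_by_window(tables, window_seconds=3600)
expired = droppable(tables, ttl_seconds=3600, now=7200)
```

With one-hour windows and a one-hour TTL, both SSTables in the first window have fully expired by time 7200 and can be deleted as files, with no merge and no tombstone scanning.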

Key idea

Pick the compaction strategy by workload: STCS for write-heavy tables, LCS for read-heavy and update-heavy tables, and TWCS for time-series data with a TTL.

Check yourself

1. Which strategy minimizes read amplification for read-heavy workloads?

2. Why is TWCS ideal for time series with a TTL?