← Lessons

quiz vs the machine

Platinum1860

System Design

Erasure Coding for Durability

Split data into fragments with parity so a lost subset can be reconstructed cheaply.

6 min read · advanced · beat Platinum to climb

Durability Without Full Copies

Keeping three full replicas of every object survives failures but costs three times the storage. Erasure coding achieves similar or better durability for far less overhead by storing parity fragments instead of whole copies.

How It Works

  • An object is split into k data fragments.
  • An encoder computes m parity fragments from them.
  • All k plus m fragments are spread across different disks or nodes.
  • The original data can be rebuilt from any k of the total fragments, so up to m losses are tolerated.

A common scheme stores ten data and four parity fragments. That survives any four failures while using only about one point four times the raw size, versus three times for triple replication.

The Tradeoffs

Erasure coding shines for large, rarely changing objects. The cost is reconstruction overhead: rebuilding a lost fragment requires reading k fragments and doing math, which raises read latency and network traffic. So hot small data often stays replicated while cold large data is erasure coded.

Key idea

Erasure coding stores k data plus m parity fragments so any k reconstruct the object, giving high durability at low overhead but paying extra read and rebuild cost compared to replication.

Check yourself

Answer to earn rating on the learn ladder.

1. In a k plus m erasure code, how many fragments are needed to reconstruct the object?

2. Why is erasure coding often used for cold large objects rather than hot small ones?

3. How does erasure coding compare to triple replication on storage overhead?