System Design

Chunked File Storage

Split a file into fixed-size or variable-size chunks that are stored independently and reassembled on read.

The idea

Rather than storing a file as one monolithic blob, chunked storage breaks it into many smaller chunks. A small manifest lists the chunks in order. To read the file, the system fetches each chunk in manifest order and concatenates them.
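
The write and read paths described above can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the names store_file and read_file, the in-memory dict standing in for a chunk store, the tiny 4-byte chunk size, and the choice to name each chunk by its SHA-256 digest are all assumptions made for the example.

```python
import hashlib

def store_file(data: bytes, store: dict, chunk_size: int = 4) -> list:
    """Split data into fixed-size chunks, store each one, and return the manifest.

    The manifest is an ordered list of chunk IDs. Using the SHA-256 digest
    as the ID is an assumption of this sketch, not a requirement.
    """
    manifest = []
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        chunk_id = hashlib.sha256(chunk).hexdigest()
        store[chunk_id] = chunk       # each chunk is stored independently
        manifest.append(chunk_id)     # the manifest records the order
    return manifest

def read_file(manifest: list, store: dict) -> bytes:
    """Fetch every chunk in manifest order and concatenate them."""
    return b"".join(store[chunk_id] for chunk_id in manifest)

# Usage: a round trip through the (hypothetical) in-memory store.
store = {}
manifest = store_file(b"hello chunked world", store)
assert read_file(manifest, store) == b"hello chunked world"
```

The manifest is the only place where order lives. The chunk store itself is an unordered key-value map, which is what lets chunks be placed on, and fetched from, different nodes.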

Why chunk

  • Parallel transfer: chunks move independently across many connections or nodes.
  • Targeted repair: only the corrupt chunk is refetched, not the whole file.
  • Sharing: identical chunks across files can be stored once (deduplication).
  • Resumability: an interrupted transfer continues from the next missing chunk.
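
Targeted repair and resumability reduce to the same operation: work out which chunks in the manifest are missing or damaged locally, and fetch only those. The sketch below assumes chunk IDs are SHA-256 digests of the chunk contents, so a chunk can be verified against its own ID. The function names and the dicts standing in for local and remote stores are hypothetical.

```python
import hashlib

def chunk_id(chunk: bytes) -> str:
    """Content-derived ID (assumed scheme: SHA-256 hex digest)."""
    return hashlib.sha256(chunk).hexdigest()

def missing_or_corrupt(manifest: list, local_store: dict) -> list:
    """Return the chunk IDs that are absent locally or fail verification."""
    needed = []
    for cid in manifest:
        chunk = local_store.get(cid)
        damaged = chunk is None or chunk_id(chunk) != cid
        if damaged and cid not in needed:
            needed.append(cid)
    return needed

def sync(manifest: list, local_store: dict, remote_store: dict) -> list:
    """Fetch only the chunks that need it and return the IDs that were fetched."""
    needed = missing_or_corrupt(manifest, local_store)
    for cid in needed:
        local_store[cid] = remote_store[cid]
    return needed

# Usage: one chunk is intact, one is corrupt, one never arrived.
chunks = [b"aaaa", b"bbbb", b"cccc"]
remote = {chunk_id(c): c for c in chunks}
manifest = [chunk_id(c) for c in chunks]
local = {manifest[0]: b"aaaa", manifest[1]: b"bXbb"}
fetched = sync(manifest, local, remote)
assert fetched == [manifest[1], manifest[2]]
```

Because each fetch in the loop is independent of the others, the same list of needed IDs could also be split across several connections, which is the parallel-transfer benefit from the list above.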

Fixed versus variable chunking

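Fixed-size chunking is simple: cut every N bytes. But inserting one byte near the front shifts every later boundary, so none of the later chunks match their previous versions and nothing can be shared. Content-defined chunking sets boundaries from the data itself using a rolling hash over a small sliding window: a boundary falls wherever the hash meets a chosen condition. Boundaries therefore move with the content, so a local edit changes only the chunks it touches and sharing is preserved.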
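
The difference can be seen in a small experiment. The sketch below is a toy, assuming a polynomial rolling hash over a 16-byte window and a boundary wherever the low 6 bits of the hash are zero. The constants, the function names, and the hash itself are illustrative choices, not what any particular system uses.

```python
import hashlib
import random

WINDOW = 16                       # bytes covered by the rolling hash
BASE = 257
MOD = (1 << 31) - 1
MASK = 0x3F                       # boundary when the low 6 bits are zero
TOP = pow(BASE, WINDOW - 1, MOD)  # weight of the oldest byte in the window

def cdc_chunks(data: bytes) -> list:
    """Content-defined chunking: cut where the rolling hash matches MASK."""
    chunks, start, h = [], 0, 0
    for i, byte in enumerate(data):
        if i - start >= WINDOW:
            # Slide the window: drop the contribution of the oldest byte.
            h = (h - data[i - WINDOW] * TOP) % MOD
        h = (h * BASE + byte) % MOD
        # Only cut once the window is full, so chunks are at least WINDOW bytes.
        if i - start + 1 >= WINDOW and (h & MASK) == 0:
            chunks.append(data[start:i + 1])
            start, h = i + 1, 0
    if start < len(data):
        chunks.append(data[start:])   # trailing bytes form the last chunk
    return chunks

def fixed_chunks(data: bytes, size: int = 64) -> list:
    """Fixed-size chunking: cut every `size` bytes."""
    return [data[i:i + size] for i in range(0, len(data), size)]

def shared_count(old_chunks: list, new_chunks: list) -> int:
    """Count chunks of the new version that already exist in the old one."""
    old_ids = {hashlib.sha256(c).digest() for c in old_chunks}
    return sum(1 for c in new_chunks if hashlib.sha256(c).digest() in old_ids)

# Usage: insert one byte at the front and compare how much is still shared.
rng = random.Random(0)
original = bytes(rng.randrange(256) for _ in range(4096))
edited = b"!" + original
print("fixed:", shared_count(fixed_chunks(original), fixed_chunks(edited)))
print("cdc:  ", shared_count(cdc_chunks(original), cdc_chunks(edited)))
```

Once the window has slid past the inserted byte, the hash depends only on bytes that are identical in both versions, so the two versions tend to agree on boundaries again shortly after the edit. Fixed-size chunks never realign, because every later cut point is off by one byte.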

Key idea

Chunked storage splits a file into independently stored chunks tracked by a manifest, enabling parallel transfer, targeted repair, sharing, and resumable transfers.

Check yourself

1. What tracks the order of a file's chunks?

2. Why does content-defined chunking beat fixed-size chunking for sharing?