The idea
Rather than storing a file as one monolithic blob, chunked storage breaks it into many smaller chunks. A small manifest lists the chunks in order. To read the file, the system fetches each chunk and concatenates them in manifest order.
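A minimal sketch of this write and read path follows. It assumes an in-memory dict as the chunk store and uses the SHA-256 digest of each chunk as its ID. Both choices are illustrative, and the names `write_file`, `read_file`, and `CHUNK_SIZE` are hypothetical rather than taken from any particular system.

```python
import hashlib

CHUNK_SIZE = 4 * 1024 * 1024  # hypothetical default: 4 MiB per chunk


def write_file(data: bytes, store: dict[str, bytes],
               chunk_size: int = CHUNK_SIZE) -> list[str]:
    """Split data into fixed-size chunks, store each one, return the manifest."""
    manifest = []
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        # Assumption: a chunk's ID is the SHA-256 hex digest of its bytes.
        chunk_id = hashlib.sha256(chunk).hexdigest()
        store[chunk_id] = chunk
        manifest.append(chunk_id)
    return manifest


def read_file(manifest: list[str], store: dict[str, bytes]) -> bytes:
    """Fetch every chunk named in the manifest and concatenate them in order."""
    return b"".join(store[chunk_id] for chunk_id in manifest)
```

Because the manifest is only an ordered list of IDs, it stays small even when the file is large.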
Why chunk
- Parallel transfer: chunks move independently across many connections or nodes.
- Targeted repair: a corrupt chunk is refetched alone.
- Sharing: identical chunks across files can be stored once, which enables deduplication.
- Resumability: an interrupted transfer continues from the next missing chunk.
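Several of these benefits can be illustrated with one small routine. The sketch below assumes chunks are identified by their SHA-256 digest and that both ends of a transfer are plain dicts. The helper names `missing_or_corrupt` and `sync` are hypothetical.

```python
import hashlib


def chunk_id(chunk: bytes) -> str:
    # Assumption: a chunk's ID is the SHA-256 hex digest of its content.
    return hashlib.sha256(chunk).hexdigest()


def missing_or_corrupt(manifest: list[str], store: dict[str, bytes]) -> list[str]:
    """Return the IDs the store lacks, or holds with content that fails its hash."""
    needed = []
    for cid in manifest:
        chunk = store.get(cid)
        bad = chunk is None or chunk_id(chunk) != cid
        if bad and cid not in needed:
            needed.append(cid)
    return needed


def sync(manifest: list[str], source: dict[str, bytes],
         dest: dict[str, bytes]) -> int:
    """Copy only the chunks dest still needs; return how many were transferred."""
    needed = missing_or_corrupt(manifest, dest)
    for cid in needed:
        dest[cid] = source[cid]
    return len(needed)
```

Calling `sync` again after an interruption transfers only what is still missing, and a chunk that fails its hash check is refetched on its own. A chunk that appears twice in a manifest, or in two files' manifests, occupies one entry in the store. The loop in `sync` is sequential here, but each copy is independent of the others, so the copies could be issued in parallel.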
Fixed versus variable chunking
Fixed-size chunking is simple: cut every N bytes. But inserting one byte near the front shifts every later boundary, so every chunk after the edit differs from its earlier version and no longer matches what is already stored. Content-defined chunking sets boundaries from the data itself using a rolling hash, so a local edit changes only the chunks it touches, preserving sharing.
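The difference can be sketched in a few lines. The code below is illustrative only: it assumes a simple polynomial rolling hash over a 16-byte window and declares a boundary wherever the low 6 bits of the hash are zero. The constants `WINDOW`, `BASE`, `MOD`, and `MASK` are hypothetical choices. Production chunkers typically use a Rabin fingerprint or a Gear hash and also enforce minimum and maximum chunk sizes, which this sketch omits.

```python
WINDOW = 16            # bytes covered by the rolling hash (assumed)
BASE = 257             # polynomial base (assumed)
MOD = (1 << 31) - 1    # modulus for the hash (assumed)
MASK = (1 << 6) - 1    # boundary when the low 6 bits are zero: ~64-byte chunks
TOP = pow(BASE, WINDOW - 1, MOD)  # weight of the byte leaving the window


def fixed_chunks(data: bytes, size: int) -> list[bytes]:
    """Cut every `size` bytes, regardless of content."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def cdc_chunks(data: bytes) -> list[bytes]:
    """Cut wherever the hash of the last WINDOW bytes matches the boundary rule."""
    chunks = []
    start = 0
    h = 0
    for i, byte in enumerate(data):
        if i >= WINDOW:
            # Roll the window: drop the contribution of the oldest byte.
            h = (h - data[i - WINDOW] * TOP) % MOD
        h = (h * BASE + byte) % MOD
        if (h & MASK) == 0:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])  # trailing bytes with no boundary after them
    return chunks
```

Once the window is full, each boundary decision depends only on the preceding `WINDOW` bytes. A one-byte insert can therefore move or create boundaries only within `WINDOW` bytes of the edit; later boundaries fall after the same bytes as before, so the chunks between them are byte-for-byte identical. With `fixed_chunks`, the same insert shifts the content of every subsequent chunk.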
Key idea
Chunked storage splits a file into independently stored chunks tracked by a manifest, enabling parallel transfer, targeted repair, sharing, and resumable transfers.