The Distributed File System

Spreading one file namespace across many machines for scale and fault tolerance.

Files spread across a cluster

A distributed file system presents one logical namespace while storing data across many servers. It must handle machines that fail, networks that drop, and files far larger than any single disk.

Splitting into chunks

Large files are cut into fixed-size chunks, often tens or hundreds of megabytes. Each chunk is replicated to several data nodes. Large chunks mean fewer chunks per file, which reduces the bookkeeping the system must track, and they favor streaming throughput over tiny random reads.
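
To make the bookkeeping argument concrete, here is a minimal sketch of fixed-size chunking. The function names and the 64 MiB and 4 KiB sizes are illustrative assumptions, not values taken from any particular system.

```python
MiB = 1024 * 1024


def split_into_chunks(file_size: int, chunk_size: int) -> list[tuple[int, int]]:
    """Return (offset, length) for each chunk; the last chunk may be short."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [
        (offset, min(chunk_size, file_size - offset))
        for offset in range(0, file_size, chunk_size)
    ]


def chunk_index(byte_offset: int, chunk_size: int) -> int:
    """Map a byte offset in the file to the chunk that contains it."""
    return byte_offset // chunk_size


file_size = 1024 * MiB                          # a 1 GiB file
big = split_into_chunks(file_size, 64 * MiB)    # assumed "large" chunk size
small = split_into_chunks(file_size, 4 * 1024)  # assumed "tiny" block size

print(len(big))    # 16 entries to track
print(len(small))  # 262144 entries to track
```

With these assumed sizes, the same file needs 16 chunk records instead of 262,144, which is the reduction in bookkeeping that large chunks buy.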

The metadata server

A central metadata server, sometimes called a master or name node, maps each file path to its list of chunks and to the nodes holding each replica. Clients ask the metadata server where a chunk lives, then read or write directly to the data nodes. Keeping bulk data off the metadata path keeps the metadata server from becoming a bottleneck.
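
The read path can be sketched as follows. This is a toy in-memory model, assuming hypothetical `MetadataServer`, `DataNode`, and `read_file` names; a real system would make these calls over the network.

```python
from dataclasses import dataclass


@dataclass
class ChunkLocation:
    chunk_id: str
    replicas: list[str]  # names of data nodes holding a copy


class MetadataServer:
    """Holds only the small mapping: path -> ordered chunk locations."""

    def __init__(self) -> None:
        self._files: dict[str, list[ChunkLocation]] = {}

    def register(self, path: str, chunks: list[ChunkLocation]) -> None:
        self._files[path] = chunks

    def locate(self, path: str) -> list[ChunkLocation]:
        return self._files[path]


class DataNode:
    """Holds the bulk chunk bytes."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.chunks: dict[str, bytes] = {}


def read_file(path: str, meta: MetadataServer, nodes: dict[str, DataNode]) -> bytes:
    data = bytearray()
    for loc in meta.locate(path):          # small metadata lookup only
        for name in loc.replicas:          # bulk bytes come from data nodes
            node = nodes.get(name)
            if node is not None and loc.chunk_id in node.chunks:
                data += node.chunks[loc.chunk_id]
                break
        else:
            raise IOError(f"no replica available for {loc.chunk_id}")
    return bytes(data)


nodes = {name: DataNode(name) for name in ("dn1", "dn2", "dn3")}
nodes["dn1"].chunks["c0"] = b"hello "
nodes["dn2"].chunks["c0"] = b"hello "
nodes["dn2"].chunks["c1"] = b"world"
nodes["dn3"].chunks["c1"] = b"world"

meta = MetadataServer()
meta.register("/logs/a.txt", [
    ChunkLocation("c0", ["dn1", "dn2"]),
    ChunkLocation("c1", ["dn2", "dn3"]),
])

print(read_file("/logs/a.txt", meta, nodes))  # b'hello world'
```

Note that `MetadataServer` never touches chunk bytes. It answers a lookup and steps aside, so its load grows with the number of lookups rather than with the volume of data moved.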

Surviving failure

  • The metadata server tracks which replicas are alive through heartbeats.
  • When a node dies, under-replicated chunks are copied elsewhere to restore the target count.
  • Writes are acknowledged only once enough replicas are durable.
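
The three behaviors above can be sketched together. The replica target of 3, the 10-second heartbeat timeout, and every name below are assumptions chosen for illustration; real systems choose their own values and placement policies.

```python
REPLICATION_TARGET = 3    # assumed target replica count
HEARTBEAT_TIMEOUT = 10.0  # assumed seconds of silence before a node counts as dead


class ClusterState:
    """What a metadata server might track about node liveness and replicas."""

    def __init__(self, nodes: list[str]) -> None:
        self.last_heartbeat: dict[str, float] = {n: 0.0 for n in nodes}
        self.replicas: dict[str, set[str]] = {}  # chunk_id -> nodes holding it

    def heartbeat(self, node: str, now: float) -> None:
        self.last_heartbeat[node] = now

    def live_nodes(self, now: float) -> set[str]:
        return {
            n for n, seen in self.last_heartbeat.items()
            if now - seen <= HEARTBEAT_TIMEOUT
        }

    def repair(self, now: float) -> dict[str, list[str]]:
        """Forget replicas on dead nodes, then pick new homes for
        under-replicated chunks. Returns chunk_id -> newly assigned nodes."""
        live = self.live_nodes(now)
        plan: dict[str, list[str]] = {}
        for chunk_id, holders in self.replicas.items():
            holders.intersection_update(live)
            missing = REPLICATION_TARGET - len(holders)
            if missing > 0 and holders:  # needs a surviving copy to clone from
                targets = sorted(live - holders)[:missing]
                if targets:
                    holders.update(targets)
                    plan[chunk_id] = targets
        return plan


def write_acknowledged(durable_acks: int, required: int = REPLICATION_TARGET) -> bool:
    """Acknowledge a write only once enough replicas report it durable."""
    return durable_acks >= required


state = ClusterState(["dn1", "dn2", "dn3", "dn4"])
state.replicas["c0"] = {"dn1", "dn2", "dn3"}
for name in ("dn1", "dn2", "dn4"):
    state.heartbeat(name, now=100.0)   # dn3 stays silent

plan = state.repair(now=105.0)
print(plan)                    # {'c0': ['dn4']}
print(write_acknowledged(2))   # False
print(write_acknowledged(3))   # True
```

In this sketch a chunk with no surviving replica is left out of the repair plan, since there is nothing to copy from. How many durable replicas count as "enough" for a write is a policy choice, which is why `required` is a parameter here.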

Key idea

A distributed file system splits files into replicated chunks across data nodes while a metadata server tracks placement, giving one large namespace that survives individual machine failures.

Check yourself

1. Why do clients read chunks directly from data nodes instead of through the metadata server?

2. What happens when a data node holding chunks fails?

3. Why are chunks made large?