The Distributed File System

Spreading one file namespace across many machines for scale and fault tolerance.

Files spread across a cluster

A distributed file system presents one logical namespace while storing data across many servers. It must handle machines that fail, networks that drop or delay messages, and files far larger than any single disk.

Splitting into chunks

Large files are cut into fixed-size chunks, often tens or hundreds of megabytes. Each chunk is replicated to several data nodes. Big chunks reduce the bookkeeping the system must track and favor streaming throughput over tiny random reads.
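
The chunk arithmetic can be sketched in a few lines of Python. This is only an illustration: the 64 MiB chunk size and the names chunk_ranges and chunk_for_offset are assumptions made for the example, not values or APIs of any particular system.

```python
CHUNK_SIZE = 64 * 1024 * 1024  # assumed 64 MiB chunk size, for illustration only


def chunk_ranges(file_size, chunk_size=CHUNK_SIZE):
    """Return (index, start, end) byte ranges for a file; end is exclusive."""
    # Integer ceiling division: the last chunk may be shorter than the rest.
    count = (file_size + chunk_size - 1) // chunk_size
    ranges = []
    for index in range(count):
        start = index * chunk_size
        end = min(start + chunk_size, file_size)
        ranges.append((index, start, end))
    return ranges


def chunk_for_offset(offset, chunk_size=CHUNK_SIZE):
    """Map a byte offset in the file to (chunk index, offset inside that chunk)."""
    return offset // chunk_size, offset % chunk_size
```

With these assumptions a 200 MiB file occupies four chunks, the last one only 8 MiB long, so the system tracks four entries rather than one per small block.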

The metadata server

A central metadata server, sometimes called a master or name node, maps file paths to the list of chunks and the nodes holding each replica. Clients ask the metadata server where a chunk lives, then read or write directly to the data nodes. Keeping bulk data off the metadata path keeps the metadata server from becoming a bottleneck.
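
The read path described above might look roughly like the following in-memory sketch. The class and function names (ChunkInfo, MetadataServer, DataNode, read_file) are hypothetical and there is no real networking; the point is only that the metadata server returns locations while the bytes come from the data nodes.

```python
from dataclasses import dataclass


@dataclass
class ChunkInfo:
    chunk_id: str
    replicas: list  # names of the data nodes holding a copy


class MetadataServer:
    """Holds only the mapping from paths to chunk locations, never file data."""

    def __init__(self):
        self.files = {}  # path -> ordered list of ChunkInfo

    def add_file(self, path, chunks):
        self.files[path] = chunks

    def lookup(self, path):
        return self.files[path]


class DataNode:
    """Stores chunk bytes keyed by chunk id."""

    def __init__(self, name):
        self.name = name
        self.chunks = {}  # chunk_id -> bytes

    def read(self, chunk_id):
        return self.chunks[chunk_id]


def read_file(meta, nodes, path):
    """Ask the metadata server for locations, then fetch bytes from data nodes."""
    parts = []
    for info in meta.lookup(path):
        # Use the first listed replica that actually has the chunk.
        for name in info.replicas:
            node = nodes.get(name)
            if node is not None and info.chunk_id in node.chunks:
                parts.append(node.read(info.chunk_id))
                break
        else:
            raise IOError(f"no reachable replica for chunk {info.chunk_id}")
    return b"".join(parts)
```

In this sketch the metadata server answers one small lookup per file, and every byte of file content flows between the client and the data nodes.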

Surviving failure

The metadata server tracks which replicas are alive through heartbeats.
When a node dies, under-replicated chunks are copied elsewhere to restore the target count.
Writes are acknowledged only once enough replicas are durable.
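
The three mechanisms above can be sketched together as follows. The timeout of 10 seconds, the target of 3 replicas, and all function names are assumptions chosen for the example; real systems tune these values and add rack awareness, throttling, and more.

```python
REPLICATION_TARGET = 3    # assumed target replica count
HEARTBEAT_TIMEOUT = 10.0  # assumed seconds of silence before a node counts as dead


def live_nodes(last_heartbeat, now, timeout=HEARTBEAT_TIMEOUT):
    """Return the nodes whose most recent heartbeat is within the timeout."""
    return {node for node, seen in last_heartbeat.items() if now - seen <= timeout}


def plan_repairs(placement, live, target=REPLICATION_TARGET):
    """Plan copies for under-replicated chunks.

    placement maps chunk_id -> set of nodes believed to hold a replica.
    Returns chunk_id -> (source node, list of destination nodes).
    """
    plan = {}
    for chunk_id, holders in placement.items():
        alive = sorted(holders & live)
        missing = target - len(alive)
        if missing <= 0 or not alive:
            continue  # already healthy, or no surviving copy to repair from
        candidates = sorted(live - holders)  # live nodes without this chunk
        plan[chunk_id] = (alive[0], candidates[:missing])
    return plan


def write_acknowledged(durable_replicas, required):
    """A write is acknowledged only once enough replicas report it durable."""
    return durable_replicas >= required
```

For example, if node c last reported 20 seconds ago, it drops out of the live set, and any chunk it held is scheduled for a copy from a surviving replica to a live node that lacks it.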

Key idea

A distributed file system splits files into replicated chunks across data nodes while a metadata server tracks placement, giving one large namespace that survives individual machine failures.
