System Design

The Distributed Scheduler

How a cluster assigns tasks to workers and reassigns them on failure.

5 min read · core

What a scheduler does

A distributed scheduler decides which worker runs which task. It must place work efficiently, react to failures, and avoid running the same task twice.

Core responsibilities

  • Placement: pick a worker with enough capacity and the right constraints.
  • Tracking: record which task is assigned where, in a consistent store.
  • Rescheduling: if a worker stops sending heartbeats, move its tasks elsewhere.
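
The three responsibilities above can be sketched in a few lines of Python. This is a minimal illustration, not a production design: the `Worker` and `Scheduler` classes, the first-fit placement rule, the 10-second heartbeat timeout, and the in-memory `assignments` dict (standing in for a consistent store) are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Worker:
    name: str
    capacity: int                 # free slots, a stand-in for CPU/memory
    last_heartbeat: float         # time of the most recent heartbeat, in seconds
    labels: set = field(default_factory=set)   # constraints this worker satisfies


class Scheduler:
    def __init__(self, workers, heartbeat_timeout=10.0):
        self.workers = {w.name: w for w in workers}
        # Tracking: task id -> (worker name, cost, required labels).
        # A real scheduler would keep this in a consistent store.
        self.assignments = {}
        self.heartbeat_timeout = heartbeat_timeout

    def _alive(self, worker, now):
        return now - worker.last_heartbeat <= self.heartbeat_timeout

    def place(self, task_id, cost=1, required=frozenset(), now=0.0):
        """Placement: first live worker with enough capacity and the right labels."""
        for w in self.workers.values():
            if self._alive(w, now) and w.capacity >= cost and set(required) <= w.labels:
                w.capacity -= cost
                self.assignments[task_id] = (w.name, cost, frozenset(required))
                return w.name
        return None   # no suitable worker right now

    def reschedule_dead(self, now):
        """Rescheduling: move tasks off workers whose heartbeat is too old."""
        moved = {}
        for task_id, (name, cost, required) in list(self.assignments.items()):
            if not self._alive(self.workers[name], now):
                del self.assignments[task_id]
                moved[task_id] = self.place(task_id, cost, required, now)
        return moved


workers = [Worker("a", capacity=1, last_heartbeat=0.0),
           Worker("b", capacity=2, last_heartbeat=0.0)]
sched = Scheduler(workers)
sched.place("t1", now=1.0)                 # lands on "a"
sched.place("t2", now=1.0)                 # "a" is full, so lands on "b"
workers[1].last_heartbeat = 20.0           # only "b" keeps heartbeating
moved = sched.reschedule_dead(now=25.0)    # "a" timed out, so t1 moves to "b"
```

Note that this sketch does nothing to stop worker "a" from finishing t1 after it has been moved, which is the double-execution problem discussed next.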

Avoiding double execution

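When a worker looks dead, the scheduler reassigns its tasks. But the old worker might be merely slow or cut off by a network partition, and it can resume later. To prevent the second run from taking effect, the scheduler relies on leases (time-limited ownership that a worker must keep renewing) or fencing tokens (a number that increases with every reassignment and is checked by whatever the task writes to), so that the resurrected old worker is rejected.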
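
A fencing token check can be sketched as follows. The names `LeaseManager`, `ResultStore`, and `StaleTokenError` are hypothetical, and the example assumes a single in-process store; in a real system the token would be issued by the scheduler's consistent store and checked by the downstream service.

```python
class StaleTokenError(Exception):
    """Raised when a write carries a token older than one already seen."""


class LeaseManager:
    """Hands out a strictly increasing fencing token on every (re)assignment."""

    def __init__(self):
        self._next_token = 0
        self.owner = {}   # task id -> (worker name, token)

    def assign(self, task_id, worker):
        self._next_token += 1
        self.owner[task_id] = (worker, self._next_token)
        return self._next_token


class ResultStore:
    """Downstream store that remembers the highest token seen per task."""

    def __init__(self):
        self.highest_token = {}
        self.results = {}

    def write(self, task_id, token, value):
        if token < self.highest_token.get(task_id, 0):
            raise StaleTokenError(f"token {token} is stale for task {task_id}")
        self.highest_token[task_id] = token
        self.results[task_id] = value


leases = LeaseManager()
store = ResultStore()

old_token = leases.assign("t1", "worker-a")   # original assignment
new_token = leases.assign("t1", "worker-b")   # reassigned after worker-a looked dead

store.write("t1", new_token, "result from worker-b")
try:
    store.write("t1", old_token, "result from worker-a")   # worker-a wakes up late
    rejected = False
except StaleTokenError:
    rejected = True
```

In this sketch the store can only reject the old worker once it has seen the newer token. If the slow worker writes first, its write is accepted and later overwritten, so fencing protects the stored result rather than preventing the duplicate computation itself.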

Key idea

A distributed scheduler places tasks on workers, tracks them consistently, and reschedules on failure while using fencing to avoid running a task twice.

Check yourself

1. What lets a scheduler safely reschedule a task from a seemingly dead worker?

2. Which is a core responsibility of a distributed scheduler?