Coordinating many tasks
A real pipeline has many steps that must run in the right order. An orchestrator like Airflow, Dagster, or Prefect schedules and monitors these steps, modeled as a directed acyclic graph, or DAG.
Why a DAG
- Directed edges show that one task must finish before another starts.
- Acyclic means no cycles, so the graph always has a valid run order and cannot loop forever.
- Tasks with no dependency between them can run in parallel.
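The three properties above can be seen in a small sketch using Python's standard-library graphlib module (Python 3.9+). The task names and the helper functions parallel_batches and has_cycle are hypothetical, chosen only for illustration; real orchestrators have their own ways of declaring dependencies.

```python
from graphlib import TopologicalSorter, CycleError

# Hypothetical pipeline: each task maps to the set of tasks it depends on.
deps = {
    "extract_orders": set(),
    "extract_users": set(),
    "join": {"extract_orders", "extract_users"},
    "report": {"join"},
}

def parallel_batches(graph):
    """Group tasks into batches; tasks within one batch do not depend on each other."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()  # raises CycleError if the graph contains a cycle
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # every task whose upstream tasks are done
        batches.append(ready)
        sorter.done(*ready)  # mark the batch finished to unlock downstream tasks
    return batches

def has_cycle(graph):
    """Return True if the graph has no valid run order."""
    try:
        TopologicalSorter(graph).prepare()
    except CycleError:
        return True
    return False
```

Here parallel_batches(deps) puts the two extract tasks in the first batch, since neither depends on the other, followed by join and then report. A graph such as {"a": {"b"}, "b": {"a"}} is rejected by has_cycle because no run order can satisfy it.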
What the orchestrator does
- Triggers runs on a schedule or an external event.
- Handles retries and backoff when a task fails.
- Tracks state so you can see which tasks succeeded, failed, or are pending.
- Enforces dependencies so a task starts only after its upstream inputs are ready.
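As a rough illustration of these responsibilities (scheduling aside), the sketch below is a toy sequential runner, not how Airflow, Dagster, or Prefect are implemented. The function run_pipeline, its parameters, and the state labels are all assumptions made for this example. It assumes every upstream name in deps is also a key in tasks.

```python
import time
from graphlib import TopologicalSorter

def run_pipeline(tasks, deps, max_retries=2, base_delay=1.0, sleep=time.sleep):
    """Run tasks in dependency order with retries.

    tasks: name -> zero-argument callable (hypothetical task bodies)
    deps:  name -> set of upstream task names
    Returns name -> "success", "failed", or "skipped".
    """
    state = {name: "pending" for name in tasks}  # state tracking
    graph = {name: deps.get(name, set()) for name in tasks}
    for name in TopologicalSorter(graph).static_order():
        # Dependency enforcement: run only if every upstream task succeeded.
        if any(state[up] != "success" for up in graph[name]):
            state[name] = "skipped"
            continue
        for attempt in range(max_retries + 1):
            try:
                tasks[name]()
                state[name] = "success"
                break
            except Exception:
                if attempt == max_retries:
                    state[name] = "failed"  # retries exhausted
                else:
                    sleep(base_delay * 2 ** attempt)  # exponential backoff
    return state
```

The sleep parameter is injectable so the backoff delays can be observed or skipped in tests. A production orchestrator would also persist state, run independent tasks concurrently, and trigger runs from a schedule or event, none of which this sketch attempts.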
Good practices
- Keep tasks idempotent so retries are safe.
- Make each task do one clear thing so a failure is easy to locate and debug.
- Set timeouts and alerts so stuck runs surface quickly.
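To make the idempotency point concrete, here is a minimal sketch contrasting two ways of loading rows for a run date. The dict-based store and both function names are hypothetical stand-ins for a real table or object store.

```python
def load_append(store, run_date, rows):
    """Not idempotent: a retry appends the same rows a second time."""
    store.setdefault(run_date, []).extend(rows)
    return store

def load_overwrite(store, run_date, rows):
    """Idempotent: a retry replaces the partition, so the result is unchanged."""
    store[run_date] = list(rows)
    return store
```

Running load_overwrite twice with the same arguments leaves the store exactly as it was after the first run, which is what makes an automatic retry safe. Running load_append twice duplicates the rows.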
Key idea
An orchestrator runs pipeline steps as a DAG, enforcing dependency order while handling scheduling, retries, and monitoring.