System Design

Distributed Deadlock

When processes across nodes wait on each other forever.

5 min read · core

What it is

A distributed deadlock occurs when processes on different nodes each hold a resource and wait for one held by another, forming a cycle of waiting that never resolves. It is the classic deadlock, but spread across the network where no single node sees the whole picture.

The four conditions

Deadlock requires all four of these conditions to hold at once, just as on a single machine:

  • Mutual exclusion: resources are held exclusively
  • Hold and wait: a process holds one resource while waiting for another
  • No preemption: resources cannot be forcibly taken from their holder
  • Circular wait: a cycle exists in the wait-for graph
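
As an illustration of the circular-wait condition, here is a minimal sketch that looks for a cycle in a wait-for graph held in one place. The function name `find_cycle` and the dictionary layout (process name mapped to the processes it waits on) are assumptions made for this example, not part of the lesson.

```python
from typing import Dict, List, Optional, Set


def find_cycle(wait_for: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one cycle of waiting processes, or None if there is none.

    wait_for[p] lists the processes that p is currently waiting on.
    """
    on_path: Set[str] = set()   # processes on the current DFS path
    finished: Set[str] = set()  # processes fully explored, known cycle-free
    path: List[str] = []        # current DFS path, in visit order

    def dfs(node: str) -> Optional[List[str]]:
        on_path.add(node)
        path.append(node)
        for nxt in wait_for.get(node, []):
            if nxt in on_path:
                # Back edge: the path from nxt to node closes a cycle.
                return path[path.index(nxt):]
            if nxt not in finished:
                cycle = dfs(nxt)
                if cycle is not None:
                    return cycle
        on_path.discard(node)
        finished.add(node)
        path.pop()
        return None

    for start in wait_for:
        if start not in finished:
            cycle = dfs(start)
            if cycle is not None:
                return cycle
    return None


deadlocked = {"P1": ["P2"], "P2": ["P3"], "P3": ["P1"]}
healthy = {"P1": ["P2"], "P2": ["P3"]}
print(find_cycle(deadlocked))
print(find_cycle(healthy))
```

The sketch assumes the whole graph is visible to one caller, which is exactly what a distributed system lacks.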

Detection across nodes

No node has the full wait-for graph, so detection is harder:

  • Edge chasing: probe messages are sent along wait edges; if a probe returns to its origin, a cycle exists
  • Centralized detection: a coordinator builds a global graph, at the risk of becoming a bottleneck and of reporting false cycles from stale data
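
Edge chasing can be sketched as a small single-process simulation. Everything here is a simplifying assumption: the name `edge_chasing_detects`, the probe tuple `(origin, sender, receiver)`, and the use of a queue in place of real network messages. In a real system each process would consult only its own outgoing wait edges.

```python
from collections import deque
from typing import Deque, Dict, List, Set, Tuple

Probe = Tuple[str, str, str]  # (origin, sender, receiver)


def edge_chasing_detects(initiator: str, local_waits: Dict[str, List[str]]) -> bool:
    """Return True if a probe started by `initiator` travels back to it.

    local_waits[p] lists the processes p is blocked on. The dict stands in
    for the network: each lookup models one process reading its own edges.
    """
    inbox: Deque[Probe] = deque(
        (initiator, initiator, target) for target in local_waits.get(initiator, [])
    )
    forwarded: Set[str] = set()  # processes that already forwarded this probe

    while inbox:
        origin, _sender, receiver = inbox.popleft()
        if receiver == origin:
            return True  # the probe came home, so the origin is on a cycle
        if receiver in forwarded:
            continue  # do not forward the same origin's probe twice
        forwarded.add(receiver)
        # A blocked receiver passes the probe along each of its wait edges.
        for target in local_waits.get(receiver, []):
            inbox.append((origin, receiver, target))
    return False


waits = {"A": ["B"], "B": ["C"], "C": ["A"]}
print(edge_chasing_detects("A", waits))
```

Note that a probe only reports cycles that pass through its initiator; a process waiting on a deadlocked group without being part of the cycle gets no probe back.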

Prevention and recovery

  • Resource ordering: acquire resources in one agreed global order so cycles cannot form
  • Timeouts: abort a transaction that has waited too long
  • Wait-die or wound-wait: schemes that use transaction timestamps to decide who aborts
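
Two of these ideas can be sketched briefly. The sketch assumes that a smaller timestamp means an older transaction, and it uses local `threading.Lock` objects as stand-ins for distributed resources. The names `Action`, `wait_die`, `wound_wait`, and `acquire_in_order` are hypothetical, chosen for this example.

```python
import threading
from contextlib import ExitStack
from enum import Enum
from typing import Dict, Iterable


class Action(Enum):
    WAIT = "requester waits"
    ABORT_REQUESTER = "requester aborts"
    ABORT_HOLDER = "holder aborts"


def wait_die(requester_ts: int, holder_ts: int) -> Action:
    """Wait-die: an older requester waits; a younger requester aborts."""
    return Action.WAIT if requester_ts < holder_ts else Action.ABORT_REQUESTER


def wound_wait(requester_ts: int, holder_ts: int) -> Action:
    """Wound-wait: an older requester aborts the holder; a younger one waits."""
    return Action.ABORT_HOLDER if requester_ts < holder_ts else Action.WAIT


def acquire_in_order(locks: Dict[str, threading.Lock], names: Iterable[str]) -> ExitStack:
    """Acquire the named locks in sorted-name order; release them on exit.

    Every caller uses the same order, so no cycle of waiters can form.
    """
    stack = ExitStack()
    for name in sorted(set(names)):
        stack.enter_context(locks[name])  # blocks until the lock is free
    return stack


print(wait_die(requester_ts=1, holder_ts=2))
print(wound_wait(requester_ts=1, holder_ts=2))

locks = {"accounts": threading.Lock(), "orders": threading.Lock()}
with acquire_in_order(locks, ["orders", "accounts"]):
    pass  # both locks are held here, taken as accounts then orders
```

In both timestamp schemes the older transaction is never the one aborted, which is why a restarted transaction would normally keep its original timestamp to avoid starving.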

Key idea

Distributed deadlock is a circular wait spread across nodes that no single node fully sees. It is detected by edge-chasing probes and prevented by resource ordering or timestamp-based abort schemes.

Check yourself

1. Which is one of the four conditions for deadlock?

2. How does edge chasing detect a distributed deadlock?