Distributed Deadlock

What it is

A distributed deadlock occurs when processes on different nodes each hold a resource and wait for one held by another, forming a cycle of waiting that never resolves. It is the classic deadlock, but spread across the network where no single node sees the whole picture.

The four conditions

Deadlock requires all four of these conditions to hold at once, the same as on a single machine:

Mutual exclusion: resources are held exclusively
Hold and wait: a process holds one resource while waiting for another
No preemption: resources are not forcibly taken from their holder
Circular wait: a cycle exists in the wait-for graph
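
The circular-wait condition can be checked mechanically once a wait-for graph is available. The following is a minimal sketch, assuming the graph is given as a dictionary mapping each process to the processes it is waiting on; the function name find_cycle and the process labels are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Optional


def find_cycle(wait_for: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one cycle in the wait-for graph as a list of processes, or None."""
    WHITE, GRAY, BLACK = 0, 1, 2   # unvisited, on the current path, fully explored
    color: Dict[str, int] = {}
    stack: List[str] = []          # the current depth-first path

    def visit(node: str) -> Optional[List[str]]:
        color[node] = GRAY
        stack.append(node)
        for nxt in wait_for.get(node, []):
            state = color.get(nxt, WHITE)
            if state == GRAY:
                # nxt is already on the current path, so the path closes a cycle.
                return stack[stack.index(nxt):] + [nxt]
            if state == WHITE:
                found = visit(nxt)
                if found:
                    return found
        stack.pop()
        color[node] = BLACK
        return None

    for node in list(wait_for):
        if color.get(node, WHITE) == WHITE:
            found = visit(node)
            if found:
                return found
    return None


# P1 waits on P2, P2 on P3, P3 on P1: a circular wait.
deadlocked = {"P1": ["P2"], "P2": ["P3"], "P3": ["P1"]}
# A simple chain of waits with no cycle.
chain = {"P1": ["P2"], "P2": ["P3"]}
```

Here find_cycle(deadlocked) reports the cycle through P1, P2, and P3, while find_cycle(chain) returns None. In a distributed setting the hard part is not this search but assembling the graph, since its edges are scattered across nodes.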

Detection across nodes

No node has the full wait-for graph, so detection is harder than on a single machine:

Edge chasing: a blocked process sends probe messages along its wait-for edges; if a probe returns to its origin, a cycle exists
Centralized detection: a coordinator builds a global graph from the nodes' local graphs, at the risk of becoming a bottleneck and of reporting false (phantom) cycles from stale data
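
Edge chasing can be illustrated with a small single-process simulation. This is only a sketch under stated assumptions: the dictionary waits_on stands in for knowledge that would really be spread across nodes (each process knowing only its own outgoing wait edges), the queue stands in for the network, and the function name edge_chasing_detects is hypothetical.

```python
from collections import deque
from typing import Dict, List, Set


def edge_chasing_detects(initiator: str, waits_on: Dict[str, List[str]]) -> bool:
    """Simulate probe forwarding; return True if a probe comes back to the initiator."""
    # A probe is (origin, sender, receiver). The initiator sends one probe
    # along each of its own wait-for edges.
    queue = deque((initiator, initiator, q) for q in waits_on.get(initiator, []))
    forwarded: Set[str] = set()  # processes that already forwarded this origin's probe

    while queue:
        origin, sender, receiver = queue.popleft()
        if receiver == origin:
            return True          # the probe returned to its origin: a cycle exists
        if receiver in forwarded:
            continue             # do not forward the same probe twice
        forwarded.add(receiver)
        # A blocked receiver passes the probe along its own wait-for edges;
        # a receiver that is not waiting has no edges and drops it.
        for nxt in waits_on.get(receiver, []):
            queue.append((origin, receiver, nxt))
    return False


ring = {"A": ["B"], "B": ["C"], "C": ["A"]}      # A -> B -> C -> A
behind = {"A": ["B"], "B": ["C"], "C": ["B"]}    # A waits on a B <-> C cycle
```

With ring, a probe started by A returns to A, so A detects the deadlock. With behind, A's probe never comes back because A is not itself on the cycle; B or C would detect it when they initiate their own probes.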

Prevention and recovery

Resource ordering: resources are always acquired in a fixed global order, so cycles cannot form
Timeouts: a transaction that has waited too long is aborted
Wait-die or wound-wait: schemes that use transaction timestamps to decide who aborts
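
The two timestamp schemes differ only in what happens when a requester meets a holder. The sketch below assumes that a smaller timestamp means an older transaction; the function names and the returned strings are hypothetical labels for illustration.

```python
def wait_die(requester_ts: int, holder_ts: int) -> str:
    """Wait-die (non-preemptive): an older requester waits, a younger one aborts."""
    if requester_ts < holder_ts:
        return "requester waits"
    return "requester aborts"


def wound_wait(requester_ts: int, holder_ts: int) -> str:
    """Wound-wait (preemptive): an older requester aborts the holder, a younger one waits."""
    if requester_ts < holder_ts:
        return "holder aborts"
    return "requester waits"


# Transaction with timestamp 5 is older than the one with timestamp 9.
old_asks_young = (wait_die(5, 9), wound_wait(5, 9))
young_asks_old = (wait_die(9, 5), wound_wait(9, 5))
```

In both schemes waiting happens in only one direction of age (older waits for younger in wait-die, younger waits for older in wound-wait), so a cycle of waits cannot close. An aborted transaction normally restarts with its original timestamp, so it eventually becomes the oldest and is not starved.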

Key idea