Why sagas exist
A business action like placing an order spans several services, each with its own database. A single distributed transaction across them is slow and brittle. A saga instead runs the action as a sequence of local transactions, each in one service.
Compensation, not rollback
There is no global rollback. If a later step fails, the saga runs compensating actions that undo the effects of the earlier completed steps.
- Forward steps each commit locally and publish that they are done.
- Compensations are explicit reversing operations, like refunding a charge or releasing a reserved seat.
Two coordination styles
- Choreography: services react to each other events with no central brain, which is simple but hard to trace.
- Orchestration: a coordinator tells each service what to do next and triggers compensations on failure.
What to design carefully
- Idempotency: steps and compensations may retry, so they must be safe to repeat.
- Semantic undo: compensation reverses business effect, not literal state, since the original commit is already visible.
Key idea
A saga replaces a distributed transaction with local commits plus compensating undo steps, trading strict atomicity for availability while requiring carefully designed idempotent compensations.