Saga
A pattern for managing distributed transactions by coordinating a sequence of local transactions with compensating actions.
A saga coordinates a set of operations across multiple services. If one step fails, it runs compensating actions to undo the steps that already completed. In effect, it is a distributed transaction built from local ones.
In a monolith, you can wrap the whole operation in a database transaction. In a distributed system, no single database transaction spans multiple services. With the saga pattern, each service performs its own local transaction and publishes an event. The next service reacts to that event and does its part. If a step fails, compensating events undo the completed steps in reverse order.
For example, placing an order might involve reserving stock, charging payment, and scheduling shipping. If the payment fails, a compensating action releases the reserved stock.
A word of caution: sagas are powerful, but they introduce significant accidental complexity. Before reaching for a saga, consider whether you actually need to split the operation across services. If the same team owns both sides, a single service with a local transaction is almost always the better choice.
References
- Distributed Transactions in Go: Read Before You Try — Discusses when sagas are appropriate and warns against using them when simpler solutions exist. Explains why distributed transactions are often overkill and how the saga pattern adds complexity.
- Event-Driven Architecture: The Hard Parts — Covers the complexity of sagas and compensating transactions. Discusses what happens when rollbacks fail, why merging services or embracing eventual consistency is often a better choice, and how a training exercise demonstrates simplifying by removing a saga.
- Learning Software Skills fast: what worked for us best in the last 15 years — Mentions the saga pattern as an example of applying patterns in the wrong context, leading to CV-driven development rather than solving real problems.