Distributed Transactions
A distributed transaction coordinates changes across multiple services so they either all succeed or all get rolled back. In a single database, this is what BEGIN and COMMIT do. Across services, there's no shared transaction manager, so you need to handle it yourself.
The most common approach is the saga pattern. Instead of one atomic transaction, you break the work into steps. Each step has a compensating action that undoes it if a later step fails. For example: book a flight, then book a hotel. If the hotel booking fails, cancel the flight.
This sounds straightforward, but the complexity adds up fast. You need to handle partial failures, retries, and the fact that compensating actions can fail too. If a compensation keeps failing, the system is left in an inconsistent state that someone has to fix manually.
Before reaching for distributed transactions, consider whether the real problem is your service boundaries. If two services always need to change together, they might belong in the same bounded context. A single SQL transaction is simpler, faster, and more reliable than any distributed coordination.