Circuit Breaker
A resilience pattern that stops calling a failing service, giving it time to recover before retrying.
A circuit breaker detects repeated failures when calling a downstream service and stops sending requests, giving that service time to recover. It prevents cascading failures where one broken component takes down the rest of the system.
The pattern works like an electrical circuit breaker. It has three states:
- Closed: Everything works normally. Requests flow through to the downstream service. The breaker tracks recent failures.
- Open: The number of recent failures has crossed a configured threshold. The breaker rejects all requests immediately without calling the service. This avoids piling more load on something that's already struggling.
- Half-open: After a cooldown period, the breaker lets a few requests through to test if the service has recovered. If they succeed, it returns to closed. If they fail, it goes back to open.
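The three states can be sketched as a small state machine. This is a minimal illustration, not a production implementation: the `Breaker` type, its fields, and `errDown` are hypothetical names. It assumes a single goroutine (no locking), counts consecutive failures rather than a failure rate, and lets a single trial request decide the half-open outcome.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// State is one of the three breaker states.
type State int

const (
	Closed State = iota
	Open
	HalfOpen
)

// ErrOpen is returned when the breaker rejects a call without running it.
var ErrOpen = errors.New("circuit breaker is open")

// errDown is a stand-in error for a failing downstream service.
var errDown = errors.New("downstream unavailable")

// Breaker is a hypothetical, single-goroutine circuit breaker.
type Breaker struct {
	MaxFailures int              // consecutive failures that trip the breaker
	Cooldown    time.Duration    // time to stay open before a trial request
	Now         func() time.Time // injectable clock; defaults to time.Now

	state    State
	failures int
	openedAt time.Time
}

// State reports the current state.
func (b *Breaker) State() State { return b.state }

// Call runs fn unless the breaker is open and still cooling down.
func (b *Breaker) Call(fn func() error) error {
	if b.Now == nil {
		b.Now = time.Now
	}
	if b.state == Open {
		if b.Now().Sub(b.openedAt) < b.Cooldown {
			return ErrOpen // fail fast, the service is not called
		}
		b.state = HalfOpen // cooldown elapsed: allow a trial request
	}

	err := fn()
	if err == nil {
		// Success closes the breaker and resets the failure count.
		b.state = Closed
		b.failures = 0
		return nil
	}

	b.failures++
	// A failed trial request, or too many failures while closed, opens it.
	if b.state == HalfOpen || b.failures >= b.MaxFailures {
		b.state = Open
		b.openedAt = b.Now()
	}
	return err
}

func main() {
	now := time.Unix(0, 0)
	b := &Breaker{
		MaxFailures: 2,
		Cooldown:    time.Second,
		Now:         func() time.Time { return now },
	}
	fail := func() error { return errDown }
	ok := func() error { return nil }

	b.Call(fail)
	b.Call(fail)                     // second failure trips the breaker
	fmt.Println(b.Call(ok))          // circuit breaker is open
	now = now.Add(2 * time.Second)   // cooldown passes
	fmt.Println(b.Call(ok))          // <nil>
	fmt.Println(b.State() == Closed) // true
}
```

Injecting the clock through `Now` keeps the cooldown testable without sleeping. Real libraries typically add locking and richer trip conditions.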
In event-driven systems, this pattern is especially useful when message subscribers process a backlog of queued messages. If the subscriber depends on an external service (like a database or API) that went down, a flood of retried messages can overload it the moment it comes back. The circuit breaker pauses message processing as soon as repeated failures are detected, so the downstream service isn't hammered while it's recovering.
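As a rough sketch of the backlog scenario, the code below drains an in-memory queue and stops as soon as the breaker opens. Everything here is hypothetical and simplified: `breaker`, `drain`, and `errDown` are illustrative names, the queue is a plain slice rather than a real broker, and the breaker never leaves the open state on its own.

```go
package main

import (
	"errors"
	"fmt"
)

// errDown is a stand-in error for an unavailable dependency.
var errDown = errors.New("database unavailable")

// breaker is a deliberately tiny stand-in: it opens after maxFailures
// consecutive errors and has no cooldown or half-open state.
type breaker struct {
	maxFailures int
	failures    int
}

func (b *breaker) open() bool { return b.failures >= b.maxFailures }

func (b *breaker) record(err error) {
	if err != nil {
		b.failures++
		return
	}
	b.failures = 0
}

// drain handles queued messages in order and stops once the breaker opens.
// It returns the messages still queued and how many handler calls were made.
func drain(queue []string, b *breaker, handle func(string) error) ([]string, int) {
	calls := 0
	for len(queue) > 0 && !b.open() {
		calls++
		err := handle(queue[0])
		b.record(err)
		if err == nil {
			queue = queue[1:] // acknowledged: remove from the queue
		}
		// On error the message stays at the head and would be retried.
	}
	return queue, calls
}

func main() {
	backlog := make([]string, 1000)
	for i := range backlog {
		backlog[i] = fmt.Sprintf("msg-%d", i)
	}

	b := &breaker{maxFailures: 3}
	rest, calls := drain(backlog, b, func(string) error { return errDown })

	// Only three calls reach the failing dependency; the backlog is intact.
	fmt.Println(calls, len(rest)) // 3 1000
}
```

Without the `b.open()` check, the loop would keep retrying the first message against the failing dependency indefinitely.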
The difference between a circuit breaker and throttling is when they act. Throttling is preventive: you set rate limits before problems occur. A circuit breaker is reactive: it kicks in after detecting failures, regardless of their cause.
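One thing to keep in mind: while the circuit is open, messages accumulate in the queue. Make sure your Dead Letter Queue strategy accounts for this. For example, if deliveries rejected by the open breaker count toward a retry limit, otherwise valid messages may end up in the Dead Letter Queue during an outage. Messages that pile up are not lost, but they will be processed with a delay once the circuit closes again.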
References
- Watermill 1.3 released, an open-source event-driven Go library — Introduces the CircuitBreaker middleware implementing the circuit breaker pattern with gobreaker. Like the Throttle middleware, it helps avoid overloading downstream services by failing fast when handlers fail repeatedly.