Microservice Musings : Transactions

Legacy architectures usually use an ACID distributed transaction strategy. This has served well for many years but performs poorly in the highly-distributed environments that are typical of a microservice architecture. With appropriate safeguards in place the ACID guarantees can be relaxed. More radically, distributed transactions can be avoided entirely using an asynchronous Event Messaging strategy.

The strengths, weaknesses and implications of each of these strategies are explored below.

ACID Transactions

To be ACID compliant a system must be:

Atomic
When multiple entities are modified within a transaction either all modifications are committed or all are rolled back
Strongly Consistent
After entities are modified in a transactional context, subsequent reads always return the modified values
Isolated
Concurrent transactions behave as though they were applied one after another in time order, i.e. serialized
Durable
The result of a transaction must persist unconditionally once it is committed

ACID compliant systems such as a DBMS (Database Management System), an MQS (Message Queuing System) or bespoke application services have many advantages: they are robust, reliable, widely supported and well understood.

Microservices regularly and successfully use local ACID compliant systems where a single resource manager manages a transaction. Problems occur when multiple systems are accessed within a transaction as this requires a global transaction manager to coordinate several resource managers to provide distributed ACID support. The coordination required is a significant impediment to scaling beyond a few nodes.

The X/Open Distributed Transaction specification describes the most commonly used distributed ACID model. By converging on a common model heterogeneous resource managers can cooperate with any compliant global transaction manager. Global transaction managers are most commonly implemented by a TPS (Transaction Processing System) with significant supporting infrastructure, such as JEE for Java, .NET for Microsoft Windows and CICS on IBM mainframes, though standalone implementations are available.

The X/Open Distributed Transaction specification mandates the use of the 2PC (2-Phase Commit) protocol, which has serious performance and scalability problems when used in highly distributed environments. The number of messages the protocol exchanges impedes horizontal scaling, the locking used to achieve Isolation impedes concurrency and throughput, and participants block when a node fails part way through the protocol.
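The message overhead of 2PC can be illustrated with a minimal in-memory sketch. It is not an XA implementation: the ResourceManager class, its can_commit flag and the message counter are all assumptions made for illustration, and real participants would also have to cope with timeouts and coordinator failure.

```python
from dataclasses import dataclass


@dataclass
class ResourceManager:
    """Hypothetical participant; `can_commit` simulates its phase 1 vote."""
    name: str
    can_commit: bool = True
    state: str = "idle"

    def prepare(self) -> bool:
        # Phase 1: vote. After a "yes" vote a real participant holds its
        # locks until the coordinator announces the decision.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self) -> None:
        self.state = "committed"

    def rollback(self) -> None:
        self.state = "aborted"


def two_phase_commit(participants: list[ResourceManager]) -> tuple[bool, int]:
    """Run both phases; return (committed?, messages exchanged)."""
    messages = 0
    votes = []
    # Phase 1: every participant is asked to prepare and replies with a vote.
    for rm in participants:
        votes.append(rm.prepare())
        messages += 2  # prepare request + vote
    decision = all(votes)
    # Phase 2: the decision is sent to every participant and acknowledged.
    for rm in participants:
        if decision:
            rm.commit()
        else:
            rm.rollback()
        messages += 2  # decision + acknowledgement
    return decision, messages
```

Even in this simplified form each transaction costs four messages per participant, and every participant that has voted "yes" must wait, holding its locks, until the decision arrives.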

Other ACID compliant distributed protocols are available including 3PC (3-Phase Commit) and Paxos. Both have benefits but neither resolves the performance and scalability problems encountered in highly distributed environments.

Ultimately, the costs incurred to achieve distributed ACID transactionality outweigh its benefits in microservice architectures, where a high level of horizontal distribution is a desirable and often essential feature.

ACIDless Transactions

Higher scalability can be achieved by dispensing with the multiphase commit protocol and the global transaction manager which manages it. This results in weaker guarantees than those provided by ACID. These are:

Atomic
A transaction will fully succeed or be fully rolled back
✘ Strongly Consistent
Given a transaction consisting of Operation1 and Operation2, state changes made by Operation1 are committed immediately, so a concurrent read after Operation1 but before Operation2 will see inconsistent state. This can be guarded against by flagging affected entities as uncommitted for the duration of the transaction.
Isolated
In most deployments, collisions are rare. Rather than locking, an OCC (Optimistic Concurrency Control) strategy is used to detect stale writes and roll back on the rare occasions when they do occur.
Durable
Transaction boundaries and operations are recorded in a durable logging service, such as that provided by Apache BookKeeper's DistributedLog, or by Apache Kafka if it is already deployed. The logging service must support high volumes of concurrent writes with low latency to minimise its impact. Log entries are used to replay operations in the event of failures and can be used to roll back aborted transactions as discussed below.
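The OCC strategy described under Isolated can be sketched as a version check on write. This is a minimal in-memory illustration only; VersionedStore and StaleWriteError are hypothetical names, and a real store would perform the compare-and-update atomically in the database.

```python
class StaleWriteError(Exception):
    """Raised when a write is based on an out-of-date read."""


class VersionedStore:
    """Hypothetical store that keeps a version number beside each value."""

    def __init__(self):
        self._data = {}  # key -> (version, value)

    def read(self, key):
        # Unknown keys are reported as version 0 with no value.
        return self._data.get(key, (0, None))

    def write(self, key, expected_version, value):
        current_version, _ = self._data.get(key, (0, None))
        if current_version != expected_version:
            # Another writer got in first: reject instead of locking.
            raise StaleWriteError(
                f"{key}: expected v{expected_version}, found v{current_version}"
            )
        self._data[key] = (current_version + 1, value)
        return current_version + 1
```

A caller reads an entity, remembers the version it saw, and passes that version back when writing. A StaleWriteError signals a collision, at which point the transaction is rolled back and may be retried.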

Rollback is achieved by invoking counterpart operations that undo the work of the committed operations. For example, take the sequence of operations reserve order items, debit payment and dispatch order. Should the dispatch order operation fail, rollback is achieved by invoking the counterpart operations credit payment and release order items. Most likely these operations will already exist as they are a necessary part of the business process. Implementation is greatly simplified when operations are idempotent, as described in Microservice Musings : Service Time.
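The order example can be sketched as follows. This is a minimal illustration under stated assumptions: run_with_compensation and the helper functions are hypothetical names, and the in-memory `completed` list stands in for the durable log.

```python
def run_with_compensation(steps):
    """Run (operation, counterpart) pairs in order.

    If an operation raises, the counterparts of the operations that have
    already completed are invoked in reverse order. Returns True on
    success and False if the transaction was rolled back.
    """
    completed = []  # stands in for the durable log of committed operations
    try:
        for operation, counterpart in steps:
            operation()
            completed.append(counterpart)
    except Exception:
        for counterpart in reversed(completed):
            if counterpart is not None:
                counterpart()
        return False
    return True


trace = []


def record(name):
    # Build an operation that simply records that it ran.
    return lambda: trace.append(name)


def fail_dispatch():
    raise RuntimeError("dispatch order failed")


steps = [
    (record("reserve order items"), record("release order items")),
    (record("debit payment"), record("credit payment")),
    (fail_dispatch, None),
]
succeeded = run_with_compensation(steps)
```

After this runs, `succeeded` is False and `trace` holds the two forward operations followed by credit payment and then release order items. Because a counterpart may be replayed after a failure, making each one idempotent keeps a repeated invocation harmless.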

ACIDless transaction behaviour can be implemented by a custom resource manager using the same APIs as an ACID resource manager, thereby avoiding the need to modify existing application code. Provided that an OCC strategy is acceptable, ACIDless transactions are an excellent route to higher performing and more scalable microservices.

Transactionless

A more radical approach is to dispense with distributed transactions entirely. Rather than invoking a sequence of operations within a transaction context, each operation records an associated state in the entities it modifies when it completes. Using the Event Message pattern, these state changes are published asynchronously via a message broker, then consumed and acted on by interested observers.

This approach adds flexibility as the sequence of operations is no longer hard coded, as is typical within a transaction. The number of observers for a given state change is unbounded, allowing any number of parallel activities. Introducing an additional sequential operation requires a new entity state and an adjustment to the states that existing observers watch. For ultimate flexibility, an inference engine can be interposed to control the sequence of invoked operations using rules or artificial intelligence.
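A minimal in-process sketch of this state-driven flow is shown below. The Broker class, the state names and the handler functions are all assumptions made for illustration. A real deployment would use an external message broker, and the queue drained here only imitates asynchronous delivery.

```python
from collections import defaultdict, deque


class Broker:
    """Hypothetical in-process stand-in for a message broker."""

    def __init__(self):
        self._observers = defaultdict(list)  # state -> handlers
        self._queue = deque()                # pending (entity_id, state)

    def subscribe(self, state, handler):
        self._observers[state].append(handler)

    def publish(self, entity_id, state):
        # Publishing only enqueues; delivery happens later in drain().
        self._queue.append((entity_id, state))

    def drain(self):
        while self._queue:
            entity_id, state = self._queue.popleft()
            for handler in list(self._observers[state]):
                handler(entity_id)


broker = Broker()
orders = {}  # entity store: order id -> current state
log = []


def set_state(order_id, state):
    # Record the new state on the entity, then publish the change.
    orders[order_id] = state
    broker.publish(order_id, state)


def debit_payment(order_id):
    log.append(f"debit {order_id}")
    set_state(order_id, "PAID")


def dispatch_order(order_id):
    log.append(f"dispatch {order_id}")
    set_state(order_id, "DISPATCHED")


def notify_customer(order_id):
    log.append(f"notify {order_id}")


broker.subscribe("RESERVED", debit_payment)
broker.subscribe("PAID", dispatch_order)
broker.subscribe("PAID", notify_customer)  # a second, independent observer

set_state("o1", "RESERVED")
broker.drain()
```

No component knows the full sequence: each observer reacts to one state and records the next. Adding the notify_customer observer required no change to the existing operations.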

As this strategy is driven by entity state, its consistency reflects that of the database in which the entities are persisted. Why database characteristics are critical to the design, performance and scalability of a solution is explored in Microservice Musings : Databases.