Consistency patterns are solutions to the problem that a system must present views and logic on consistent underlying data, while that same data can temporarily be inconsistent when it is in the process of being changed. Most of us know that ACID and BASE are ‘opposites’, at least in the chemical world. In the IT world they are mechanisms with similar goals, but they achieve those goals in completely opposite (or at least different) ways, with potentially different end results. It is vital that they not be confused with each other, as they are distinctly different. As in the chemical world, ACID and BASE can be used to complement each other. Both are used in the centrally coordinated aggregation of activities, where each implements a specific form of consistency control. Composition typically uses transaction commit and rollback (ACID), and orchestration typically uses compensation activities (BASE).
Regarding the problem that ACID and BASE try to solve, both mechanisms are intended to make sure the end result of an operation leaves a system in a consistent state. However, the way they do this is radically different and the end result of the application of either approach can be different as well.
Which of the two is needed depends on your needs or, better, your business's needs. The default response of almost everyone at business level, if asked which would be more suitable, is "I need ACID". Very rarely do people choose BASE, even after it has been suggested to them. Most people's reaction was to want the full monty regarding consistency; no one was willing to make any trade-offs to free up system resources, even if this meant purchasing bigger and more powerful hardware. It seems that transactional integrity was considered more important than common sense, especially given that problematic inconsistency of data can be detected and resolved.
For a person applying either approach, the differences are concentrated in the way the approaches work and in the amount of effort required for implementation (design time). The amount of effort required of systems at execution (runtime) is also very different: there is a significant difference in the amount of claimed resources, which directly affects scalability. Most modern middleware has support for ACID-style transactions, like commit and rollback, making life a lot easier for designers. Orchestration notation standards allow for a standardized approach to designing BASE compensation activities. BASE can however also be applied outside the scope of orchestration. Transactions are most commonly associated with fairly contained and short-lived activities (such as single service calls or, on modern platforms, perhaps also service compositions), whereas compensation is more likely to be associated with orchestration.
ACID - Atomic Consistent Isolated Durable
A transaction is a centrally coordinated aggregation of activities with specific characteristics enforcing the consistency of the data before, after and while it is being changed. Only the participants of the transaction can see temporarily inconsistent data as the act of changing data takes time. Software components that do not participate in transactions will see the ‘old’ unchanged data of the transaction until the transaction is committed.
To provide context, the concept of commit and rollback must be explained a bit further. If services participate in transactions (i.e. are controlled by a transaction manager implementing the WS-AtomicTransaction specification), the services vote on the outcome of the transaction. When its core service logic has been executed, each service casts a vote indicating whether that logic executed successfully or not. If the logic executed successfully, the vote is "commit"; if it failed, the vote is "rollback".
The transaction manager collects all votes and allows the transaction only to succeed or fail as a whole. As a consequence, only two end results are possible: either all participants are allowed to commit their changes and make them persistent, or all are directed to abort their changes, effectively undoing all the work performed since the transaction started.
The transaction can only succeed if all participants voted for the completion of the transaction. In this scenario, the transaction manager sends a signal to the participating services to commit their work. If one or more of the participants voted against completion (abort), the transaction is fully rolled back, including the work of services that had voted for the transaction to succeed. In this scenario the transaction manager sends a signal to the participants to fully undo their work.
Figure 1 – A successful transaction performs commit (top) and a failed transaction performs rollback (bottom) – Transactions can either be committed, or rolled back. Either way the transaction controller receives the participants' votes, decides what the transaction outcome will be and calls the appropriate operations to either complete or rollback the transaction.
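The voting behavior described above can be sketched in a few lines of Python. This is a minimal illustration only, not an implementation of WS-AtomicTransaction: the names `Participant`, `TransactionManager`, `Vote` and the `will_succeed` flag are hypothetical and exist purely to show the all-or-nothing outcome.

```python
from enum import Enum


class Vote(Enum):
    COMMIT = "commit"
    ROLLBACK = "rollback"


class Participant:
    """Hypothetical service taking part in a transaction."""

    def __init__(self, name, will_succeed=True):
        self.name = name
        self.will_succeed = will_succeed  # stands in for the core service logic
        self.state = "idle"

    def prepare(self):
        # Execute the core service logic and vote on the outcome.
        self.state = "prepared"
        return Vote.COMMIT if self.will_succeed else Vote.ROLLBACK

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "rolled back"


class TransactionManager:
    """Collects all votes and enforces a single outcome for everyone."""

    def run(self, participants):
        votes = [p.prepare() for p in participants]
        if all(vote is Vote.COMMIT for vote in votes):
            for p in participants:
                p.commit()
            return "committed"
        # One rollback vote undoes everyone's work, including the work of
        # participants that voted to commit.
        for p in participants:
            p.rollback()
        return "rolled back"


services = [Participant("orders"), Participant("billing", will_succeed=False)]
outcome = TransactionManager().run(services)
print(outcome, [p.state for p in services])
```

Note that the "orders" participant ends up rolled back even though it voted to commit, which is exactly the behavior the bottom half of Figure 1 describes.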
Generally speaking, as long as a transaction is not committed or rolled back, it can occupy and lock significant amounts of resources such as memory and disk space. This is because the participants in a transaction are working on a set of data, while other parts of the system that do not participate in the transaction should not be aware of the changes in progress.
To achieve this, the system often keeps a read-only copy of the old data. This ensures that logic outside the scope of the transaction sees the old version of the data, and that the system can revert to exactly the state it was in when the transaction started. While transactions are in progress, the resources they consume limit the scalability of the overall solution: bigger resource claims mean that fewer concurrent operations fit on the same machine. The larger the total volume of claimed resources (simplified: the size of occupied resources per transaction times the number of concurrently running transactions), the less scalable the system becomes.
Figure 2 – The diagram shows that participants of a transaction can see the data inside the transaction being changed, whereas non-participants see a copy of the previous state.
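The simplified estimate mentioned above (size of occupied resources per transaction times the number of concurrently running transactions) can be turned into a small back-of-the-envelope calculation. The budget and footprint figures below are invented for illustration and are not measurements of any real system.

```python
def claimed_bytes(bytes_per_transaction, concurrent_transactions):
    """Simplified volume: size per transaction times concurrency."""
    return bytes_per_transaction * concurrent_transactions


def max_concurrent_transactions(budget_bytes, bytes_per_transaction):
    """How many transactions of a given size fit in a fixed budget."""
    return budget_bytes // bytes_per_transaction


MIB = 1024 * 1024
budget = 8 * 1024 * MIB  # hypothetical: 8 GiB reserved for transaction state

# Hypothetical footprints: a large read-consistent copy per transaction
# versus a much smaller footprint per transaction.
print(max_concurrent_transactions(budget, 64 * MIB))  # 128
print(max_concurrent_transactions(budget, 1 * MIB))   # 8192
```

On the same machine, shrinking the footprint per transaction is what raises the ceiling on concurrency; the alternative is a bigger budget, which means bigger hardware.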
Now coming to the acronym ACID, the characteristics of a transaction can briefly be described as:
Atomic: non-participants view the transaction as one single activity
Consistent: before, during and after the transaction, non-participants see a consistent state
Isolated: the participants can make their collective changes in full isolation of non-participants, effectively making sure that the non-participants never see any intermediate state
Durable: once the transaction is committed, its result persists even in case of power loss. If the system fails before the commit, the previous durable state becomes visible to all again
The collective application of these characteristics allows transactions to behave the way they do: allowing participants to make changes while the system is always in a consistent state, where participants can control the changes and non-participants see copies of the previous state. Once a transaction is committed, the read-consistent view of the pre-transaction state is discarded and non-participants see the newly committed state as if it were changed instantaneously. Note that other transaction styles exist and only one fairly common transaction-style is described here.
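The visibility rules described above can be illustrated with a toy in-memory store. This is a sketch under strong assumptions: `SnapshotStore` is a hypothetical class, it allows only one transaction at a time, and it ignores durability entirely. It shows only that participants see a working copy while non-participants keep reading the last committed state.

```python
import copy


class SnapshotStore:
    """Toy store: one open transaction at most; readers see the last commit."""

    def __init__(self, data):
        self._committed = dict(data)
        self._working = None  # exists only while a transaction is open

    def begin(self):
        if self._working is not None:
            raise RuntimeError("a transaction is already in progress")
        # The committed state stays untouched as the read-consistent copy;
        # participants change a separate working copy.
        self._working = copy.deepcopy(self._committed)

    def read(self, key, participant=False):
        if participant and self._working is not None:
            return self._working[key]
        return self._committed[key]

    def write(self, key, value):
        if self._working is None:
            raise RuntimeError("no transaction in progress")
        self._working[key] = value

    def commit(self):
        if self._working is None:
            raise RuntimeError("no transaction in progress")
        # Non-participants switch to the new state in a single step.
        self._committed = self._working
        self._working = None

    def rollback(self):
        # Discard the working copy; the previous state was never altered.
        self._working = None


store = SnapshotStore({"checking": 100, "savings": 0})
store.begin()
store.write("checking", 70)
store.write("savings", 30)
print(store.read("checking"))                    # non-participant view
print(store.read("checking", participant=True))  # participant view
store.commit()
print(store.read("checking"), store.read("savings"))
```

While the transaction is open the store holds two full copies of the data, which is the resource cost discussed earlier in this section.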
WS-AtomicTransaction is an extension of the middleware platform (such as an enterprise service bus) that offers simulated distributed transactions spanning multiple systems. It allows the simulation of ACID-like transactions across Web Service boundaries. Although the approach is similar to the ACID approach, one distinct difference exists: instead of creating one big transaction context, the system breaks up the transaction into smaller, still centrally coordinated transaction contexts. Any participating services will eventually be entirely committed, or fully rolled back. If the central coordination mechanisms are slow, the intermediate state might be visible for brief periods of time.
Figure 3 – A WS-AtomicTransaction simulates an ACID-style transaction across service boundaries. Each service on its own platform or virtual machine will create a local transaction with the local transaction manager of the platform. Transaction state between (web service) transaction manager and local transaction manager is synchronized using messages. To illustrate that the participants do not necessarily run in the same virtual machine or even on the same platform, one of the participants has a different color.
Similar to 'regular' ACID-style transactions, WS-AtomicTransactions can still hold on to significant amounts of read-consistent data. So even though such a transaction only provides simulated transactional behavior and is not truly atomic, it still has isolation consequences, and the scalability concern remains an important consideration.
BASE - Basically Available, Soft State, Eventually Consistent
In a BASE approach, the need to commit and the ability to roll back do not exist. If things happen according to plan, they are done. There is no need to confirm to anyone that they are done if we don't want to, which is contrary to participating in a transaction. There is a downside to the BASE pattern: there is no rollback mechanism. Since this does not exist, the architect/designer is responsible for creating so-called compensation logic or compensation processes/handlers. These are pieces of logic which handle specific exception scenarios. It is easy to make mistakes when identifying problematic scenarios and, as a consequence, to forget one or two, so it requires ample planning, designing and reviewing to make sure all crucial scenarios are covered. Sometimes platforms offer limited support for undo operations or undo process fragments to facilitate the flow of thought. Even if such platform support does not exist, it is good practice to provide fragments of undo logic for crucial parts of the system, and to execute these where necessary.
Business activities are unlike transactions because they are long-lived complex activities. The duration of a business activity can be minutes, hours, days or even weeks. During this time span, numerous participants can be involved. The BASE approach is most commonly associated with business activities.
The business activity coordinator (the implemented consistency control mechanism of this pattern) must manage its own state and its progress in the process. This means that it must track what it did and, based on when and where things went wrong, explicitly execute compensation activities to undo the parts of the work that had succeeded until that moment. There is however one big advantage: because the system generally does not need to keep track of the overall state the process comprises, but only of the state that is relevant to the execution of the process, fewer resources are consumed, which has a very positive influence on the system's scalability and potentially its performance. Sometimes the place in the process where a failure happened is implicitly the discriminating factor for which compensation logic must be executed. This is at least true for platforms which do not support coordination frameworks like WS-BusinessActivity.
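A coordinator of this kind can be sketched as follows. This is a minimal illustration and not an implementation of WS-BusinessActivity: the `BusinessActivity` class, its `add_step` method and the example steps are all hypothetical. The coordinator remembers which steps completed and, when a step fails, runs their compensations in reverse order.

```python
class BusinessActivity:
    """Runs steps in order; on failure, compensates the completed steps."""

    def __init__(self):
        self._steps = []  # (name, action, compensation) in execution order
        self.log = []     # the coordinator's own record of its progress

    def add_step(self, name, action, compensation):
        self._steps.append((name, action, compensation))

    def run(self):
        completed = []
        for name, action, compensation in self._steps:
            try:
                action()
            except Exception as exc:
                self.log.append(f"{name} failed: {exc}")
                # Where the failure happened decides what must be undone:
                # only the steps that completed before it, newest first.
                for done_name, undo in reversed(completed):
                    undo()
                    self.log.append(f"compensated {done_name}")
                return False
            completed.append((name, compensation))
            self.log.append(f"completed {name}")
        return True


stock = {"widget": 5}


def reserve():
    stock["widget"] -= 1


def release():
    stock["widget"] += 1


def charge():
    raise RuntimeError("card declined")


activity = BusinessActivity()
activity.add_step("reserve stock", reserve, release)
activity.add_step("charge card", charge, lambda: None)
print(activity.run(), stock, activity.log)
```

Between the reservation and its compensation, any other consumer reading `stock` sees the reduced quantity. No copy of the previous state is kept; the coordinator holds only its own progress.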
Because the architect or designer can apply checkpoint logic selectively or strategically, potentially less compensation logic has to be executed. This allows a service designer to focus more on the blue-sky scenarios of the actual core service logic. The scope of compensation logic can be reduced to a minimum, since it becomes possible to compensate back to the previous checkpoint rather than compensating the entire, longer process with all its related complexity.
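One possible reading of checkpoint logic is sketched below. It assumes that a checkpoint marks all earlier work as accepted, so that a later failure only needs to be compensated back to that point. The `CheckpointedActivity` class is hypothetical, and real orchestration platforms may define checkpoints differently.

```python
class CheckpointedActivity:
    """Tracks compensations only for work done since the last checkpoint."""

    def __init__(self):
        self._pending = []  # compensations for work since the last checkpoint

    def do(self, action, compensation):
        action()
        self._pending.append(compensation)

    def checkpoint(self):
        # Assumption: work before this point is accepted as consistent and
        # will no longer be compensated.
        self._pending.clear()

    def compensate(self):
        """Undo work back to the previous checkpoint, newest first."""
        undone = 0
        while self._pending:
            self._pending.pop()()
            undone += 1
        return undone


done = []
activity = CheckpointedActivity()
activity.do(lambda: done.append("order"), lambda: done.remove("order"))
activity.checkpoint()
activity.do(lambda: done.append("invoice"), lambda: done.remove("invoice"))
activity.do(lambda: done.append("shipment"), lambda: done.remove("shipment"))

# Something goes wrong after the shipment step: only the two steps after
# the checkpoint are compensated, the order itself is kept.
print(activity.compensate(), done)
```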
This is possible because the architect is in full control of the "when and how" of any applied core logic as well as supporting compensation logic. Since no isolated consistency control exists in the system, temporary inconsistency is an almost-certain consequence. It is not only a consequence; it is the actual foundation of why this pattern is so much more scalable. Temporary inconsistency is allowed in a BASE approach to make system resources more manageable. The information in the system is "Basically Available".
The fact that data can be viewed while in the process of being changed is what we refer to as "Soft state". If compensation logic is to be executed, any intermediate inconsistency in the state may be visible to participants of the orchestration, as well as to outside consumers.
Eventually, when all service logic has executed, the system is left in a consistent state; this is what "Eventually Consistent" refers to. Presumably, as the core service logic executes relatively quickly, the number of occasions on which the service logic is required while the data is only available in an inconsistent state should be manageable.
As referenced several times now, the BASE approach occupies fewer system resources than the ACID approach. The reason is that, contrary to the ACID approach, the BASE approach does not keep copies of previous system states in memory. The BASE approach does eventually bring the system into a consistent state, though not necessarily back into the original state. This is fine: there is no need to always bring the system back into the original state, as long as the end state is a consistent state and the overall system logic is designed with this in mind.
WS-BusinessActivity is a cross-service coordination approach which implements the BASE approach. Not all platforms support it, but those that do not might offer something similar, like "undo" operations. As WS-BPEL and BPMN 2.0 both offer support for this, it is fairly widely accepted on most modern orchestration platforms.
The table below describes how consistent the data is perceived to be under each approach.
Table 1 – Consistency compared for each consistency control mechanism.
Consistency Patterns and Scalability
To understand why transactions are less scalable than compensation logic, a number of dynamics must be understood. Two types of state are relevant here: the state related to the coordination framework and the state related to the consistency of the managed resources.
Table 2 – State management for each consistency control mechanism shows a significant difference between either approach
As ACID-controlled resources generally involve larger amounts of data, the most straightforward way of scaling is vertical: purchasing more memory or disk space (in other words, larger machines). As there is a limit to the physical size of a machine, you can only get them ‘so’ big, so this is not a very future-proof approach. Also, when increasing hardware capacity this way, quite often a certain amount of capacity is wasted to make room for the new hardware (for example by replacing the current machine with the next bigger model, or by replacing smaller memory modules or disks with bigger ones). Then what happens to the old hardware?
Horizontal scaling, also known as deploying more machines side by side to process more data, is far more extensible and typically perceived as being cheaper, as you can always add capacity without wasting the capacity you already have, and you can standardize the machines you use.
BASE resources are typically a lot smaller and inherently use fewer system resources; only the coordination state is relevant. The result is an overall smaller system, or a smaller collection of nodes, to reach the necessary system capacity.
In an ESB, the nature of the distributed service capabilities it comprises, together with the lack of mature platform support for the available standards, makes it extremely difficult to reach transactional ‘integrity’ across service boundaries. The ACID approach is difficult to implement, if not impossible on certain platforms. The BASE approach is more easily implementable, especially when crossing service boundaries. Whether or not a coordination framework is available to facilitate BASE-managed activities depends on the platform.
Task services are intended to solve a short-lived coordination concern and allow for cross-service transactions. Orchestrated task services are intended to solve fairly long-lived concerns and are inherently unsuitable for running transactions, as these would occupy resources for extended periods of time. While transactions are in progress, more memory or disk space is consumed and data (records or tables) is potentially locked. When large amounts of data are locked for extended periods of time, the result can be an unusable system.
If we look at task services, and certainly orchestrated task services, BASE is the encouraged approach from a resource-allocation point of view. The ACID approach can be used if it is really required, in certain critical situations that demand an extremely low margin of error. This is why it's such a good idea to offer the BASE approach more often than the ACID approach. Offering BASE is easier if it's known what margin of error the business or project principal accepts. This can be discovered during the requirements clarification phase, a very important phase of every service delivery effort. Without proper requirements and business process investigations (discovery, clarification/elicitation), it is virtually impossible to see whether you really need ACID or can live with BASE.
One thing that is important to remember, however, is that in one way or another transactions are more expensive than compensation. A concern is that the amount of resources saved is often hard to quantify up front, and arguing with the project or business principal about the cost saved by choosing BASE over ACID might not be the right way to proceed. Sometimes it’s worth the investment to offer a small preview (implement both) and simulate the consequences in a test environment to show the difference.
Conclusion: What Happens When We Mix ACID and BASE?
Business activities and atomic transactions can be combined to enhance the overall consistency of a system while guarding the amount of consumed system resources.
To facilitate this, transactions can be encapsulated in the scope of a business activity. This way, long-running activities can be managed BASE-style while short-lived activities are still managed ACID-style. Transactions and business activities can thus be used in a complementary fashion, achieving a scalable yet still as-consistent-as-possible overall system. It must also be understood that both the blue-sky core service logic and the compensation logic can be wrapped separately, based on their respective execution context and path.
Figure 4 – Atomic Transactions can be encapsulated by Business Activities.
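The combination can be sketched with Python's built-in `sqlite3` module standing in for a transactional resource. Everything domain-specific here is an assumption made for illustration: the `stock` and `orders` tables, the function names and the `ship` callback are hypothetical. Each short-lived step runs inside a local ACID transaction, while the longer-running activity around it relies on compensation.

```python
import sqlite3


def make_db():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE stock (item TEXT PRIMARY KEY, qty INTEGER CHECK (qty >= 0))"
    )
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT, qty INTEGER)"
    )
    conn.execute("INSERT INTO stock VALUES ('widget', 5)")
    conn.commit()
    return conn


def place_order(conn, item, qty):
    # Short-lived ACID step: 'with conn' commits both statements together,
    # or rolls both back if either one raises.
    with conn:
        cur = conn.execute(
            "INSERT INTO orders (item, qty) VALUES (?, ?)", (item, qty)
        )
        conn.execute(
            "UPDATE stock SET qty = qty - ? WHERE item = ?", (qty, item)
        )
        return cur.lastrowid


def cancel_order(conn, order_id):
    # Compensation: a new transaction that restores a consistent state.
    with conn:
        item, qty = conn.execute(
            "SELECT item, qty FROM orders WHERE id = ?", (order_id,)
        ).fetchone()
        conn.execute("DELETE FROM orders WHERE id = ?", (order_id,))
        conn.execute(
            "UPDATE stock SET qty = qty + ? WHERE item = ?", (qty, item)
        )


def fulfil(conn, item, qty, ship):
    """Business activity: an ACID step followed by a long-running step."""
    order_id = place_order(conn, item, qty)
    try:
        ship(order_id)  # may take hours or days; no transaction is held open
    except Exception:
        cancel_order(conn, order_id)  # BASE-style compensation
        return False
    return True
```

A caller would pass in whatever long-running step applies, for example `fulfil(conn, "widget", 2, ship=notify_carrier)` with a hypothetical `notify_carrier` function. No database transaction stays open while shipping is pending, so nothing is locked for the duration of the business activity; the price is that the order is briefly visible to others before a possible cancellation.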
When business activities are combined with transactions this can significantly increase the quality of service of a long-lived process. As such these can be – if combined in a smart way – truly beneficial to your business.