
Errors in Database Systems, Eventual Consistency, and the CAP Theorem

Michael Stonebraker

April 5, 2010

Recently, there has been considerable renewed interest in the CAP theorem for database management system (DBMS) applications that span multiple processing sites. In brief, this theorem states that there are three interesting properties that could be desired by DBMS applications: consistency (C), availability (A), and partition-tolerance (P).

User Comments (12)

"Unfortunately, I can’t seem to locate that paper, which was Tandem-specific, in any case."

I think you meant: "Why Do Computers Stop and What Can Be Done About It?"

Which is here: http://www.hpl.hp.com/techreports/tandem/TR-85.7.html

It is really good to have a serious discussion of this topic. I do not, however, agree that CA is better than AP.

I have been convinced since 1988 that distributed two-phase commit with ACID transactions is not the right way forward. At that time I was responsible for the application API in CICS at the IBM Hursley lab. The problem is very simple: even in a heterogeneous environment, two-phase commit takes too long and is too liable to fail.

Although the Open Group (X/Open at the time) managed to create a standard for transaction managers talking to resource managers (XA), they never managed to stabilise the standard for transaction-manager-to-transaction-manager communication (XA+). We had first-hand experience of this in IBM trying to get IMS, CICS and OS/400 to do distributed two-phase commits. This failure to standardise means that even if you go for CA rather than AP, you won't find the middleware to help you do it in a homogeneous environment.

Even if you standardise on a single transaction manager which can do distributed two-phase commits (an area where there are many Bohr-type errors still in the code of most commercial application servers), you will find that the architecture doesn't scale. The bank I am currently consulting with is putting in a new core banking system. A customer update (a new address, say) implies updating five other systems in three data centres. The latency of this would be unbearable. Customer updates are master data changes and happen about 1,000 times less often than transactional data changes, like making payments. There the problem is different but even worse. A distributed two-phase commit keeps a session open to each transaction manager for each transaction until it commits. Using ACID for payment processing, for instance, would quickly lead to running out of resources in current operating systems.
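To make the latency and session-holding concrete, here is a minimal sketch of the two-phase commit flow described above. The names (Participant, prepare, commit) and the round-trip figures are illustrative, not any vendor's XA API; the point is that the coordinator holds a session open to every participant, and every participant holds its locks, across both phases.

    import time

    class Participant:
        def __init__(self, name, rtt):
            self.name = name
            self.rtt = rtt        # simulated round-trip time to this site

        def prepare(self):
            time.sleep(self.rtt)  # phase 1: force a prepare record, then vote
            return True           # a single "no" vote aborts everyone

        def commit(self):
            time.sleep(self.rtt)  # phase 2: force the commit record

    def two_phase_commit(participants):
        start = time.time()
        # Phase 1: every participant must vote before anyone may commit.
        # From here until phase 2 arrives, each one is in doubt and is
        # holding locks (and, per the comment above, an open session).
        if not all(p.prepare() for p in participants):
            return "aborted"
        # Phase 2: only now can locks and sessions be released.
        for p in participants:
            p.commit()
        return "committed in %.2fs" % (time.time() - start)

    # Five systems across three data centres, as in the banking example:
    sites = [Participant("system-%d" % i, rtt)
             for i, rtt in enumerate([0.02, 0.02, 0.15, 0.15, 0.15], 1)]
    print(two_phase_commit(sites))

The sketch runs the phases sequentially for clarity; a real coordinator parallelises each phase, but it still pays two forced-log round trips to the slowest site on every transaction, with locks held the whole time.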

There is a management argument too. Two-phase commit buys you consistency only because, in the case of a crash, you can recover all resources back to a consistent state. However, if you use this as the architecture for consistency, then, in a typical enterprise environment, you might expect each transaction to have sessions to five other transaction managers and ten resource managers (at least a database and a queue for each) across, say, three data centres. That means coordinating log-identifier exchange (an unpleasant feature of distributed two-phase commit) across three data centres and fifteen parties. Our experience was that as soon as two data centres were involved, heuristic commit ruled the day. In other words, practically speaking, distributed two-phase commit results in consistency by guesswork.

Back in 1988 we found that customers were already voting with their feet. A large rail operator in the US was distributing train schedule updates (a safety-critical transaction) using messages rather than CICS's built-in two-phase commit, because they found it more reliable. This was a large part of the reason we decided to implement transactional MQ: it is a much better way to distribute transactions.
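For contrast, a sketch of the messaging pattern in the same style, with an in-memory deque standing in for a persistent, transactional queue (the helper names are hypothetical, not MQ's API). The sender commits once, locally; each receiving system applies the update in its own local transaction, and redelivery handles failures.

    from collections import deque

    queue = deque()                  # stands in for a persistent queue

    def send_update(update):
        # One local transaction at the sender: write the change and
        # enqueue the message atomically. No session to any remote
        # transaction manager is held; the sender is done on commit.
        queue.append(update)

    def receive_updates(apply):
        # Each receiving system applies the update in its own local
        # transaction. Delivery is at-least-once: a failed apply is
        # put back for redelivery, so apply must be idempotent.
        while queue:
            msg = queue.popleft()
            try:
                apply(msg)
            except Exception:
                queue.append(msg)    # retry on a later pass

    send_update({"customer": 42, "address": "1 New Street"})
    receive_updates(lambda m: print("applied:", m))

The trade is explicit: the systems are momentarily inconsistent while messages are in flight, but no site ever waits on a remote commit, which is the AP side of the argument.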

The reasons many of us prefer AP to CA have nothing to do with whether we prefer NoSQL to SQL and everything to do with the inability of the vendors to scale distributed two-phase commit in real-world environments. Jim Gray knew this very well. He tried hard to create a nested-transaction semantic that vendors could implement, but all of us went to messaging instead. He admitted part of this to The Register in one of his last interviews (http://www.theregister.co.uk/2006/05/30/jim_gray/), where he says:

"Frankly, over 25 years there's been a lot of work in this area [when to use transactions] and not much success. Workflow systems have, by and large, come to the conclusion that what the accountants told us when we started was correct - the best thing you can do is have compensation if you run a transaction and something goes wrong. You really can't undo the transaction, the only thing you can do is run a new transaction that reverses the effects of that previous transaction as best you can."
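Gray's compensation idea is easy to sketch in the same style (all names here are illustrative): each step is a committed local transaction paired with a reversing transaction, and a failure part-way through runs the compensations for the steps already done, in reverse order, rather than pretending the committed work can be undone.

    def run_with_compensation(steps):
        done = []
        try:
            for action, compensate in steps:
                action()                  # a committed local transaction
                done.append(compensate)
        except Exception:
            # Committed work cannot be rolled back; run the reversing
            # transactions for whatever has already happened instead.
            for compensate in reversed(done):
                compensate()
            return "compensated"
        return "completed"

    steps = [
        (lambda: print("debit payer"),  lambda: print("re-credit payer")),
        (lambda: print("credit payee"), lambda: print("reverse the credit")),
    ]
    print(run_with_compensation(steps))

As Gray says, the reversal is only "as best you can": a compensation is a new transaction with its own business meaning, not an undo.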