In the big open world of the cloud, highly available distributed objects will rule.
All Your Database Are Belong To Us
The following letter was published in the Letters to the Editor in the January 2013 CACM (http://cacm.acm.org/magazines/2013/1/158757).
I write to support and expand on Erik Meijer's article "All Your Database Are Belong to Us" (Sept. 2012). Relational databases have been very useful in practice but are increasingly an obstacle to progress due to several limitations:
Inexpressiveness. Relational algebra cannot conveniently express negation or disjunction, much less the generalization/specialization connective required for ontologies;
Inconsistency non-robustness. Inconsistency robustness is information-system performance in the face of continually pervasive inconsistencies, a shift from the once-dominant paradigms of inconsistency denial and inconsistency elimination attempting to sweep inconsistencies under the rug. In practice, it is impossible to meet the requirement of the Relational Model that all information be consistent, but the Relational Model does not process inconsistent information correctly. Attempting to use transactions to remove contradictions from, say, relational medical information is tantamount to a distributed-denial-of-service attack due to the locking required to prevent new inconsistencies even as contradictions are being removed in the presence of interdependencies;
Information loss and lack of provenance. Once information is known, it should be known thereafter. All information stored or derived should have provenance; and
Inadequate performance and modularity. SQL lacks performance because it has parallelism but no concurrency abstraction. Needed are languages based on the Actor Model (http://www.robust11.org) to achieve performance, operational expressiveness, and inconsistency robustness. To promote modularity, a programming language type should be an interface that does not name its implementations contra to SQL, which requires taking dependencies on internals.
There is no practical way to repair the Relational Model to remove these limitations. Information processing and storage in computers should apply inconsistency-robust theories(1) processed using the Actor Model(2) in order to use argumentation about known contradictions using inconsistency-robust reasoning that does not make mistakes due to the assumption of consistency.
This way, expressivity, modularity, robustness, reliability, and performance beyond that of the obsolete Relational Model can be achieved because computing has changed dramatically both in scale and form in the four decades since its development. As a first step, a vibrant community, with its own international scientific society, the International Society for Inconsistency Robustness (http://www.isir.ws), conducted a refereed international symposium at Stanford University in 2011 (http://www.robust11.org); a call for participation is open for the next symposium in the summer of 2014 (http://www.ir14.org).
Palo Alto, CA
1. Hewitt, C. Health information systems technologies. Stanford University CS Colloquium EE380, June 6, 2012; http://HIST.carlhewitt.info
2. Hewitt, C., Meijer, E., and Szyperski, C. The Actor Model. Microsoft Channel 9 Videos; http://channel9.msdn.com/Shows/Going+Deep/Hewitt-Meijer-and-Szyperski-The-Actor-Model-everything-you-wanted-to-know-but-were-afraid-to-ask
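As an editorial illustration of the modularity point in the letter above (not part of the published letter), the idea of a type as "an interface that does not name its implementations" can be sketched with structural typing. This is a minimal sketch: the names PatientStore, InMemoryStore, and summarize are hypothetical and chosen only for the example.

```python
from typing import Dict, List, Protocol


class PatientStore(Protocol):
    """Hypothetical interface: it states required behavior, names no implementation."""

    def lookup(self, patient_id: str) -> List[str]:
        ...


class InMemoryStore:
    """One possible implementation; it never mentions PatientStore."""

    def __init__(self) -> None:
        self._notes: Dict[str, List[str]] = {}

    def record(self, patient_id: str, note: str) -> None:
        self._notes.setdefault(patient_id, []).append(note)

    def lookup(self, patient_id: str) -> List[str]:
        # Return a copy so callers cannot reach into the internal representation.
        return list(self._notes.get(patient_id, []))


def summarize(store: PatientStore, patient_id: str) -> str:
    """Client code depends only on the interface, not on how notes are stored."""
    return "; ".join(store.lookup(patient_id))


store = InMemoryStore()
store.record("p1", "allergy: penicillin")
store.record("p1", "no known allergies")  # contradictory notes are both kept
print(summarize(store, "p1"))
```

Any other class with a compatible lookup method could be passed to summarize without changing the client code.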
The following letter was published in the Letters to the Editor in the December 2012 CACM (http://cacm.acm.org/magazines/2012/12/157873).
Communications readers have a right to expect accuracy. Sadly, accuracy is not always what they get. The article "All Your Database Are Belong to Us" by Erik Meijer (Sept. 2012) contains so many inaccuracies, confusions, and errors regarding "the database world" it is difficult to read coherently. The first paragraphs alone contain more egregious misstatements than most entire articles or papers. For the record:
"The raw physical data model" is categorically not "at the center of the [relational database] universe."
Queries do not "assume intimate details of the data representation (indexes, statistics, metadata)."
While database technology relies on "The Closed World Assumption," this assumption has nothing to do with what the author apparently meant.
Every phrase in "Exposing naked data and relying on declarative magic becomes a liability" relies on at least one counterfactual.
"Objects should hide their private data representation, exposing it only via well-defined behavioral interfaces." But this is exactly what the relational model does, except (unlike OO) it adopts an interface discipline that makes ad hoc query and the like possible.
"In the realm of [data] modelers, there is no notion of data abstraction." Astoundingly wrong.
"[Database technology necessarily involves] a computational model with a limited set of operations." False. Although the (very powerful, well-defined, provably correct) required set of relational operations is small, the sky's the limit on derived relational operations or operations that define abstract data type/domain behavior.
The author's unfounded antipathy toward relational databases dominates even his application of CAP: "The problem with SQL databases... is the assumption that the data... meets a bunch of consistency constraints that is difficult to maintain in an open ['anything goes'?] distributed world." CAP does not eliminate this requirement; "...the hidden cost of forfeiting [system-enforced] consistency... is the need [for the programmer] to know the system's invariants."(1) Nor can programmers "...design their systems to be robust...to inconsistency." Once data inconsistency invades a computationally complete system, it is not even, in general, detectable, and all bets are off. Consistency must be enforced, hence constraints. The author seemed to equate detecting abnormal execution with enforcing logical data consistency. No wonder confusion abounds; CAP consistency is single-copy consistency, a subset of what ACID databases provide, yet the Gilbert/Lynch CAP proof relies on linearizability, a more stringent requirement than the serializability ACID databases need or use.
And so on... Deconstructing the entire article properly would take more time than we care to devote, but the foregoing should suffice to demonstrate its fallaciousness. We hope the author is not teaching these confusions, errors, logical inconsistencies, and fallacies.
It is difficult even to believe the article was peer reviewed. Indeed, it is truly distressing it did not demonstrate even minimal understanding of one of the most important contributions to computing: the relational model. We can only deplore Communications's role in promulgating such a lack of understanding.
Boulder Creek, CA
1. Brewer, E. CAP twelve years later: How the 'rules' have changed. IEEE Computer 45, 2 (Feb. 2012), 23-29.
The purpose of the article was not to criticize the relational model but to point out how building industrial-strength systems using today's relational database systems requires leaving the ivory tower and dealing with a morass of ad hoc extensions to the clean mathematical basis of first-order predicate logic. Rather than depend on pure sets and relations, developers need to think in terms of (un)ordered multisets. For the sake of efficiency and lock-contention avoidance, transactions allow for various isolation levels that clearly violate the ACID guarantees of Platonic transactions. The article also considered whether in the new world of the Cloud we should view as complementary computational models that fundamentally address loosely coupled distributed systems, like Carl Hewitt's Actors.
Erik Meijer, Delft
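The multiset remark in the response above can be made concrete with a small editorial sketch (not part of the published exchange). It uses Python's standard sqlite3 module; the person table and its rows are invented for illustration. A plain SELECT projection keeps duplicate rows, so the result is a bag rather than a set unless DISTINCT is requested.

```python
import sqlite3


def project_city(distinct: bool) -> list:
    """Project the city column of a small, made-up table, with or without DISTINCT."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE person (name TEXT, city TEXT)")
    conn.executemany(
        "INSERT INTO person VALUES (?, ?)",
        [("Ann", "Delft"), ("Bob", "Delft"), ("Cy", "Palo Alto")],
    )
    keyword = "DISTINCT " if distinct else ""
    # ORDER BY is needed for a deterministic order; without it the rows are unordered.
    rows = conn.execute(
        f"SELECT {keyword}city FROM person ORDER BY city"
    ).fetchall()
    conn.close()
    return [row[0] for row in rows]


bag = project_city(distinct=False)    # duplicates survive the projection
as_set = project_city(distinct=True)  # set semantics must be asked for explicitly
print(bag)
print(as_set)
```

In relational algebra over pure sets, the projection would yield each city once; SQL's default behavior differs, which is one example of the gap the response describes.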