Since the famous conjecture by Eric Brewer and proof by Seth Gilbert and Nancy Lynch, CAP has given the world countless learned discussions about distributed systems and many a well-funded start-up. Yet who truly understands what CAP means? Even a cursory survey of the blogosphere shows profound disagreement about the meaning of terms like CP, AP, and CA in real systems. Those who disagree on CAP include some of the most illustrious personages of the database community.
We can therefore state with some confidence that CAP is confusing. Yet this observation itself raises deeper questions. Is CAP merely confusing? Or is it the case that, as with other initially accepted but now doubtful ideas like the Copernican model, evolution, and continental drift, CAP is actually not correct? Thoughtful readers will agree this question has not received anywhere near the level of scientific scrutiny it deserves.
Fortunately for science, private citizens like me have been forging ahead without regard to the opinions of so-called experts or even common sense. My work on CAP relies on two trusted analytic tools of database engineers over the legal drinking age: formal logic and beer. Given the nature of the problem, we should obviously use a minimum of the former and a maximum of the latter. We have established that CAP is confusing. To understand why, we must now deepen our confusion and study its habits carefully. Other investigators have used this approach with great success.
Let us begin by translating the terms of CAP into the propositional calculus. The terms C (consistency), A (availability) and P (partition tolerance) can be used to state the famous "two out of three" of CAP using logical implication as shown below.
(1) A and P => not C
(2) P and C => not A
(3) C and A => not P
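For readers who prefer their logic machine-checked before the next round arrives, here is a minimal sketch that enumerates all eight truth assignments for C, A, and P. It assumes material implication for "=>", and the helper names (implies, cap_rules, rules_agree_everywhere) are made up for this illustration. It suggests that statements (1) through (3) are one and the same claim: you cannot have all three at once.

```python
from itertools import product


def implies(p, q):
    """Material implication: p => q is false only when p is true and q is false."""
    return (not p) or q


def cap_rules(c, a, p):
    """Evaluate statements (1), (2) and (3) for one truth assignment."""
    return (
        implies(a and p, not c),  # (1) A and P => not C
        implies(p and c, not a),  # (2) P and C => not A
        implies(c and a, not p),  # (3) C and A => not P
    )


def rules_agree_everywhere():
    """Check that all three statements match not (C and A and P) on every row."""
    for c, a, p in product([False, True], repeat=3):
        r1, r2, r3 = cap_rules(c, a, p)
        not_all_three = not (c and a and p)
        if not (r1 == r2 == r3 == not_all_three):
            return False
    return True


if __name__ == "__main__":
    print(rules_agree_everywhere())
```

The only row where the statements fail is the one where C, A, and P are all true, which is the usual reading of "two out of three".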
So far so good. We can now dispense briefly with logic and turn to confusion. It seems there is difficulty distinguishing between CA and CP systems, from which we conclude that they are equivalent. This is a key insight, which we can express formally as follows:
(4) C and A <=> C and P
which further reduces to
(5) A <=> P
In short, our confusion has led us directly to the invaluable result that A and P, hence availability and partition tolerance, are exactly equivalent! I am sure you share my excitement at the direction this work is taking. Through a trivial substitution of A for P in equation 2 above, we can now reveal the following:
(6) A and C => not A
(7) C => (A => not A)
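The move from (6) to (7) is the one step here that can be checked while sober. A minimal sketch, again assuming material implication and using hypothetical helper names (step6, step7, steps_match), compares the two formulas on every assignment of A and C:

```python
from itertools import product


def implies(p, q):
    """Material implication: p => q is false only when p is true and q is false."""
    return (not p) or q


def step6(a, c):
    """(6) A and C => not A"""
    return implies(a and c, not a)


def step7(a, c):
    """(7) C => (A => not A)"""
    return implies(c, implies(a, not a))


def steps_match():
    """True if (6) and (7) have the same truth value on all four assignments."""
    return all(
        step6(a, c) == step7(a, c)
        for a, c in product([False, True], repeat=2)
    )


if __name__ == "__main__":
    print(steps_match())
```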
We have just shown that consistency implies that any system that is available is also unavailable simultaneously. This is an obvious contradiction, which means the vast logical edifice on which CAP relies crumbles like a soggy nacho. Considering the amount of beer consumed at the average database conference it is surprising nobody thought of this before.
At this point we can raise the conversation up a level from looking for spare change under the table and comment on the greater meaning of our results in the real world, which is the following: given the way most of us programmers write software, it's a wonder CAP is an issue at all. Honestly, I can't even get calendar programs to send invitations to each other across time zones. I plan to bring the combustible analytic capabilities of logic and beer to bear on the mystery of time at a later date. For now we can just speculate it is due to a mistaken design based on CAP.