Wednesday, October 29, 2014

An Ending and a Beginning: VMware Has Acquired Continuent

As of today, Continuent is part of VMware. We are absolutely over the moon about it.

You can read more about the news on the VMware vCloud blog by Ajay Patel, our new boss. There’s also an official post on our Continuent company blog. In a nutshell the Continuent team is joining the VMware Cloud Services Division. We will continue to improve, sell, and support our Tungsten products and work on innovative integration into VMware’s product line.

So why do I feel exhilarated about joining VMware? There are three reasons. 

1.     Continuent is joining a world-class company that is the leader in virtualization and cloud infrastructure solutions. Even better, VMware understands the value of data to businesses. They share our vision of managing an integrated fabric of standard DBMS platforms, both in public clouds and in local data centers. It is a great home to advance our work for many years to come.

2.     We can continue to support our existing users and make Tungsten even better. I know many of you have made big decisions to adopt Continuent technology that would affect your careers if they turned out badly. We now have more resources and a mandate to grow our product line. We will be able to uphold our commitments to you and your businesses.

3.     It’s a great outcome for our team, which has worked for many years to make Continuent Tungsten technology successful. This includes our investors at Aura in Helsinki, who have been dogged in their support throughout our journey.

Speaking of the Continuent team…I am so proud of what all of you have achieved. Today we are starting a new chapter in our work together. See you at VMware!

Monday, October 6, 2014

Exorcising the CAP Demon

Computer science is like an enormous tool box you can rummage through whenever you have a problem to solve. Most of the tools are sturdy and practical, like algorithms for B-trees. Some are also elegant, like consistent hashing in Dynamo. Finally there are some tools that you never quite figure out even after years of reflection. That piece of steel you are looking at could be Excalibur. Or it could be a rusty knife.

The CAP theorem falls into the last category, at least for me.  It was a major topic in the blogosphere a few years ago and Google Trends shows steadily increasing interest in the term since 2010.  It's not my goal to explain CAP fully--a good informal description is here or you can just read the proof yourself.  Instead I would like to talk about how I understand and use the CAP theorem today as well as how that understanding might evolve in the future.

In a nutshell CAP puts a limit on how distributed database systems trade off data consistency and system availability.   Eric Brewer originated the theorem as a conjecture in the late 1990s. Seth Gilbert and Nancy Lynch supplied a proof of the conjecture in 2002.  Brewer described it as follows in 2012:
The CAP theorem states that any networked shared-data system can have at most two of three desirable properties:
  • consistency (C) equivalent to having a single up-to-date copy of the data;
  • high availability (A) of that data (for updates); and
  • tolerance to network partitions (P).
My initial problem in understanding CAP was relating the proof to what happens in the real world, which is not especially easy. Network partitions are an example.  Here's how the Gilbert/Lynch proof defines them in Section 2.3.
When a network is partitioned, all messages sent from nodes in one component of the partition to nodes in another component are lost. (And any pattern of message loss can be modeled as a temporary partition separating the communicating nodes at the exact instant the message is lost.)
So does this include an asymmetric communication failure? That's where a process on one host can see and send messages to a process on another host but the reverse is not true. This happens all the time in group communications for reasons that range from application software bugs to bad cabling and everything in between. Do you model the asymmetry as a sequence of temporary partitions? It's of course possible. But it feels a bit like using Ptolemaic astronomy with epicycles.
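To make the epicycle feeling concrete, here is a minimal sketch of what that modeling looks like. It assumes a toy network with logical ticks and a set of broken directed links; the names `Message`, `delivered`, and `temporary_partitions` are invented for illustration and do not come from the Gilbert/Lynch paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    tick: int  # logical time at which the message is sent (assumed model)
    src: str
    dst: str


def delivered(msg, broken_links):
    """A message is lost when its directed link (src, dst) is broken."""
    return (msg.src, msg.dst) not in broken_links


def temporary_partitions(messages, broken_links):
    """Re-express each lost message as a partition that exists only at that tick."""
    return [
        (m.tick, frozenset({m.src}), frozenset({m.dst}))
        for m in messages
        if not delivered(m, broken_links)
    ]


# Asymmetric failure: A can reach B, but B cannot reach A.
broken = {("B", "A")}
traffic = [
    Message(1, "A", "B"),
    Message(2, "B", "A"),
    Message(3, "A", "B"),
    Message(4, "B", "A"),
]
partitions = temporary_partitions(traffic, broken)
```

In this toy run the network is "partitioned" at ticks 2 and 4 and whole at ticks 1 and 3. The model holds together, but it describes one steady fault as a partition that flickers on and off with every reply.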

Other people have made similar observations. Eric Brewer even wrote about the "nuances" of partitions in his 2012 retrospective. There are analogous problems with the other terms. There was enough public disagreement about their meaning that I wrote a "disproof" of CAP a few years back as an April Fools' Day joke. It depended on not being able to distinguish CA and CP choices in real systems. 

That confusion is not a problem with the CAP theorem itself. Nobody has seriously challenged the proof. Instead, it's a matter of what logicians refer to as interpretation, which links a logical model to some domain of discourse so that you can draw valid conclusions about that domain. If you want to reason about real-world systems using the CAP theorem you must first ensure your systems really match the model. Otherwise it's like using a map of Oregon to drive between New York and Boston.  The core difficulty is that the CAP theorem proof assumes binary properties whereas in reality properties like availability operate on a sliding scale. 

My other issue with CAP evaluation is what you might call a suitability problem. There are a lot of issues with operating distributed systems, and the 3-way trade-off is irrelevant to many of them. For instance, what happens when the network is behaving and you don't have to make pesky choices between availability and consistency? Let's look at some examples. 

CAP defines consistency as linearizability, which means that transactions on different replicas look as if they all happened at once in a single place in a single unbroken series. Imagine driving around to different automated teller machines at a bank and making changes to your account balance or checking it. No matter which teller machine you visit next, it knows exactly what happened before and has the right balance amount. Or imagine a shopping cart on a retail website. No matter how you jump around the website to select clothing or even if you fold up your laptop and fly to Paris, the items in your shopping cart remain consistent without duplicate or missing selections. 
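The teller machine example can be sketched in a few lines. This is a toy, single-threaded model, not a real linearizability checker: the `Bank` and `Replica` classes are hypothetical, and "synchronous" here simply means every replica is updated before the deposit returns.

```python
class Replica:
    def __init__(self):
        self.balance = 0


class Bank:
    """Two teller machines, each holding a replica of one account (illustrative only)."""

    def __init__(self, synchronous):
        self.synchronous = synchronous
        self.replicas = [Replica(), Replica()]
        self.pending = []  # (replica index, amount) updates not yet applied

    def deposit(self, atm, amount):
        self.replicas[atm].balance += amount
        for i in range(len(self.replicas)):
            if i == atm:
                continue
            if self.synchronous:
                self.replicas[i].balance += amount  # applied before we return
            else:
                self.pending.append((i, amount))    # applied some time later

    def read(self, atm):
        return self.replicas[atm].balance

    def drain(self):
        """Apply all queued updates, as a lagging replica eventually would."""
        for i, amount in self.pending:
            self.replicas[i].balance += amount
        self.pending.clear()


lazy = Bank(synchronous=False)
lazy.deposit(0, 100)
stale = lazy.read(1)      # 0: the second machine has not heard about the deposit
lazy.drain()
caught_up = lazy.read(1)  # 100: correct, but only eventually

strict = Bank(synchronous=True)
strict.deposit(0, 100)
fresh = strict.read(1)    # 100: behaves like a single copy
```

The stale read in the asynchronous case is exactly what linearizability rules out: the second machine answers as if the deposit never happened.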

You might say, well, not all systems work that way.  You would be right, and that's exactly the point.  Real distributed systems do not always try to ensure linearizability. It turns out that many people, most particularly end users who ultimately pay for computer systems, conclude they don't really care so much about consistency of the sort CAP promises.  Here are two different types of reasons: 

1. Linearized consistency is expensive. Keeping active replicas up to date requires round trip messages between hosts, which can increase transaction commit times by an order of magnitude or more. Users are allergic to slow response, regardless of any other benefits that slowness might bring.  Daniel Abadi pointed out this latency problem some time ago in a great blog post on CAP that is still excellent reading today.  

2. Linearized consistency is irrelevant for many applications. Consider a measurement from a household thermometer or a text message from a cell phone.  There is only one of each generated in a single location. Your servers either get them or they don't.  Multiple copies are just that: replicas of the same thing. Conflicts don't exist. 
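The cost in the first reason is easy to see with back-of-the-envelope arithmetic. The sketch below assumes replicas are contacted in parallel, so the slowest round trip gates the commit. The function name and all of the millisecond figures are made-up illustrations, not measurements.

```python
def commit_latency_ms(local_commit_ms, replica_round_trips_ms):
    """Commit time when every listed replica must acknowledge before commit.

    Replicas are assumed to be contacted in parallel, so only the slowest
    round trip adds to the local commit cost.
    """
    return local_commit_ms + max(replica_round_trips_ms, default=0.0)


# Hypothetical numbers: 2 ms local commit, two remote replicas over a WAN.
local_only = commit_latency_ms(2.0, [])
with_sync_replicas = commit_latency_ms(2.0, [40.0, 75.0])
slowdown = with_sync_replicas / local_only
```

With these assumed figures the commit goes from 2 ms to 77 ms, a slowdown of well over an order of magnitude, and it happens even when the network is perfectly healthy.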

The share of immutable data from analytic systems like Hadoop and object stores like Amazon S3 is increasing rapidly, which means that there is an increasing number of applications for which CAP is not the only or even a major design consideration. It might be in the guts of the system but it's just one of many problems at that level and there may be multiple choices. The original Hadoop architecture actually ignored CAP for one critical part of the system--the NameNode, which maps HDFS file names to storage, was a single point of failure.
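A small sketch shows why immutable records sidestep conflicts. It assumes each record carries a unique identifier and never changes after creation; the record IDs and the `apply_records` helper are hypothetical.

```python
def apply_records(store, records):
    """Add immutable records keyed by unique ID; re-delivery changes nothing."""
    for record_id, value in records:
        store.setdefault(record_id, value)
    return store


readings = [
    ("sensor-1@08:00", 20.5),
    ("sensor-1@08:05", 20.7),
    ("sensor-1@08:10", 21.0),
]

# Replica A sees the readings in order; replica B sees them reversed,
# plus one duplicate delivery.
replica_a = apply_records({}, readings)
replica_b = apply_records({}, list(reversed(readings)) + readings[:1])
same = replica_a == replica_b
```

Order and duplication do not matter here. A replica can only be behind, never in disagreement, so the hard part of the consistency problem does not arise.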

Which brings us back to understanding CAP at a practical level.  Is it Excalibur or just the rusty knife?  At this point it feels like another tool in the toolbox that you use at the right time, albeit carefully. Imagine a band saw that does not have a very good guard on the blade. Here are my personal instructions for safe use. 

1. Use it for suitable problems.  The CAP theorem applies to a very specific problem involving systems that want to remain consistent and available across multiple networked hosts.  If you design clusters or distributed databases, this is a relatively big deal. The trade-offs are real and you have to think about them. 

For instance at Continuent we have some problems where the theorem is directly applicable. We build clusters that implement failover.  We have to consider how to establish consensus while keeping the cluster available even when members lose messages or respond slowly. The CAP theorem guides you to manage this kind of problem rather than try to solve it using techniques that will not work, such as adding timeouts on messages.  (Continuent Tungsten clusters are generally CP, in case you are wondering.) 

2. Avoid CAP where it does not obviously apply. It is a tricky theorem to interpret correctly, and many applications are concerned with unrelated problems. I work a lot on transactional replication. There are no CAP issues in Tungsten Replicator.  At the other end of the spectrum if you build systems that link multiple stores using replication, you likely have multiple CAP choices under the covers.  That's a common pattern in complex applications. 

It is therefore important to look with a jaundiced eye upon any product that claims to "beat CAP," like this widely read article. This is just marketing hype. If your application matches the CAP theorem model, it applies and you are subject to the limitations. If the limitations don't seem to make sense you have not evaded them. You are either working on a problem to which CAP is not relevant or you made implicit CAP choices of which you are not aware. It is easy to make a fool of yourself by asserting otherwise. 

3. Other tools are important too.  CAP of course does not even cover all trade-offs in clusters.  There are also many issues to consider when building distributed data systems that actually work.  Latency, durability of data, monitoring, automation, reliability, ability to do zero-downtime maintenance, and security are critical. Especially security. That looks like the next big problem for a lot of existing distributed systems. 
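For the failover case under the first instruction, a CP choice often comes down to a majority rule. The sketch below is a generic quorum check and not a description of how Continuent Tungsten makes its decisions; `has_quorum` and `accept_write` are hypothetical names.

```python
def has_quorum(reachable_members, cluster_size):
    """True if this member can see a strict majority, counting itself."""
    return reachable_members > cluster_size // 2


def accept_write(reachable_members, cluster_size):
    # CP choice: give up availability on the minority side rather than
    # let the two sides of a partition diverge.
    if not has_quorum(reachable_members, cluster_size):
        raise RuntimeError("no quorum: refusing write to stay consistent")
    return "committed"


# A three-member cluster split 2/1 by a partition.
majority_side = accept_write(2, 3)
try:
    accept_write(1, 3)
    minority_refused = False
except RuntimeError:
    minority_refused = True
```

The minority side is unavailable for writes until the partition heals. That is the trade-off made explicit instead of hidden behind a timeout.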

Beyond these, don't stop thinking about CAP. It is one of those ideas that gets under your skin and really bugs you. In addition to Eric Brewer's 2012 article, Seth Gilbert and Nancy Lynch wrote a follow-up perspective on the implications of CAP, so even the originators are continuing to consider the problems. The long term value of CAP is that it has focused attention on a set of difficult data management problems and led to numerous productive ideas about how to manage them. The resulting evolution is not nearly finished.  We will all continue to worry this bone for many years to come. 

Scaling Databases Using Commodity Hardware and Shared-Nothing Design