Sunday, March 29, 2009

Implementing Relaxed Consistency Database Clusters with Tungsten SQL Router

In December 2007 Werner Vogels posted a blog article entitled Eventual Consistency, since updated with a new article entitled Eventually Consistent - Revisited. In a nutshell it described how to scale databases horizontally across nodes by systematically trading off availability, strict data consistency, and partition resilience as defined by the CAP theorem. According to CAP, you can only have two of three of these properties at any one time. The route to highly available and performant databases, according to Vogels, is eventual consistency in which distributed database contents at some point converge to a single value but at any given time may be inconsistent across replicas. This is the idea behind databases like SimpleDB.

I read the original blog article at about 2am on a Sunday morning. It was like a thunderclap. Like transactions and state machines, CAP was one of those ideas that provide instant clarity to a large class of problems, in this case related to database availability and performance. But it also raised an immediate question: can't we apply CAP systematically on conventional SQL databases? That way you don't have to throw away the relational database baby away with the strict consistency bathwater.

This is not an implausible idea. Most database engines have built-in master/slave replication to at least some degree, so there's no problem distributing data. (Shameless plug: If you don't like what your database provides, try ours.) The real problem is that you need to change how applications access the database. They need to implement CAP trade-offs in a consistent and easily understandable way. That's where the Tungsten SQL Router comes in.

Tungsten SQL Router is a thin Java JDBC driver wrapper that enhances conventional JDBC drivers to implement database session semantics based on CAP. SQL Router adds a "quality of service" or qos to each database session. Being programmers we had to invent our own terms, so here are the initial qos values.
  • RW_STRICT -- This session is used for writes; all data are strictly consistent, i.e., appear to all applications on RW_STRICT sessions as soon as they are written. In CAP terms you are picking data consistency + partition tolerance. (Vogel's article uses the term "causal consistency.")
  • RO_RELAXED -- This session is used for reads; data consistency is relaxed, i.e., represents data at an earlier point in time. In CAP terms you are picking availability + partition tolerance. (Vogel's article uses the term "monotonic reads.")
Clients can request the preferred quality of service whenever they create a new transaction. The SQL Router then takes care of connecting to a database that meets the semantics. Here's a typical Tungsten SQL Router URL (i.e., connection string) that routes connections to a MySQL master database:
jdbc:t-router://myservice/mydb?qos=RW_STRICT&createDatabaseIfNotExist=true
The SQL Router only steps in to select connections and to break them as necessary when databases go offline. It has almost no performance impact on Java applications, because we don't touch result sets and there are no proxy hops. That's an important requirement to achieve maximal application transparency.

Making CAP work properly for conventional applications is not entirely straightforward, which is one of the main reasons why you don't want the logic to be a part of your application. Here are some of the key features that Tungsten SQL Router provides.
  • Distributed database services. SQL Router groups databases into "services." Each database in the service is defined using a simple resource file that defines its name, location, and role (e.g., master or slave).
  • Remote management interfaces. Databases fail or go offline for maintenance and cluster resources change over time. Strict consistency connections in fact explicitly choose to fail when the database is no longer available rather than access old data, so you must handle failover easily. Tungsten SQL Router has a built-in JMX administrative interface that allows you to promote a slave database to become a master, take databases offline, bring them back online, etc., without disturbing or even necessarily notifying applications.
  • Support for non-Java applications. The world is a diverse place and not every application is written in Java. You can embed the SQL Router in the Tungsten Connector, a proxy that allows native MySQL and PostgreSQL applications (Perl, PHP, Ruby, name your favorite...) to connect without library changes or even changing connection strings.
  • Integration with connection pools. SQL Router provides call-backs that can be used to let application connection pools know when to give up applications. A little cooperation here makes things like failover work much more smoothly.
There are other features but it's probably simplest if you visit the Tungsten Project on SourceForge.net, read the wiki documentation, download a copy, and try it out for yourself. There's general information about Tungsten on our community website. Note: our community site is due for an update shortly to add more information about SQL Router and other new projects we are releasing. For the next few days please check out SourceForge.net.

Finally, here's an interesting thought that shows the power of applying CAP semantics in SQL applications. So far we have been talking about database replicas. However, SQL Router relaxed consistency sessions could just as easily read query data from a distributed cache like memcached. An application that specifies qos=RO_RELAXED on a connection is saying it will accept possibly out-of-date data in return for availability. Semantically there is no difference between a cache and a database replica--you can substitute any conforming storage implementation. Exploiting that idea pretty much defines our long-term roadmap for the SQL Router.

In summary SQL Router provides a simple model so that applications can choose whether they want availability or full data consistency while ensuring basic properties like partition resilience. These semantics are key to extending the scale-out database design model to increasingly large clusters, and equally important, to make that model easy to use for clusters of all sizes. Tungsten SQL Router is a work in progress, but the idea of using CAP semantics really seems to have legs. I hope you will try it out and let us know what you think.

p.s., I would like thank David Van Couvering for pointing out Werner Vogel's article in his blog as well as my colleague Ed Archibald for getting the SQL Router off the ground. Nice working with you guys. :)

3 comments:

Kevin Rae said...

Robert-

I caught your presentation at the Percona Performance conference this week and was interested in learning more about the session consistenecy features available from the Tungsten SQL router. Is there a chance that you or someone from Continuent could get in touch to discuss how we can leverage this with our Postgres DBs?

Robert Hodges said...

Hi Kevin!

Glad you are interested! Email me at robert dot hodges at continuent dot com. :) As I mentioned during the talk session consistency is our next magic trick but I can already share the basic design.

Cheers, Robert

Anonymous said...

For session consistency, I suggest that you look at two papers:

K. Daudjee, K. Salem: Lazy Database Replication with Ordering Guarantees. ICDE 2004: 424-435
and
K. Daudjee, K. Salem: Lazy Database Replication with Snapshot Isolation. VLDB 2006: 715-726.

Scaling Databases Using Commodity Hardware and Shared-Nothing Design