Monday, November 17, 2008

Announcing Tungsten Replicator Beta for MySQL

Pluggable open source replication has arrived, at least in beta form. Today we are releasing Tungsten Replicator 1.0 Beta-1 with support for MySQL. This release is the next step in bringing advanced data replication capabilities to open source and has many improvements and bug fixes. It also (finally) has complete documentation. I would like to focus on an interesting feature that is fully developed in this build: pluggable replication.

I have blogged about our goals for Tungsten Replicator quite a bit, for instance here and here. We want the Replicator to be platform-independent and database-neutral. We also want it to be as flexible as possible, so that our users can:
  • Support new databases easily
  • Filter and transform SQL events flexibly
  • Replicate between databases and applications, messaging systems, or files that are not traditionally combined with replication
It was clear from the start we needed to factor the design cleanly. The result was an architecture where the main moving parts are interchangeable plug-ins. Here's a picture:

There are three main types of plug-ins in Tungsten Replicator.
  • Extractors pull data out of a source, usually a database, without changing the source.
  • Appliers put the events in a target, usually a database.
  • Filters transform or drop events after extraction or before application.
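To make the division of labor concrete, here is a minimal sketch of how the three plug-in types could fit together. It is not the Tungsten Replicator API. The names Event, run_pipeline, only_schema, and rename_schema are hypothetical, invented for this illustration, and plain Python lists stand in for the source and target databases.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional


@dataclass
class Event:
    """Hypothetical stand-in for one replicated SQL event."""
    seqno: int
    schema: str
    sql: str


# A filter returns the (possibly transformed) event, or None to drop it.
Filter = Callable[[Event], Optional[Event]]


def run_pipeline(extractor: Iterable[Event],
                 filters: List[Filter],
                 applier: Callable[[Event], None]) -> int:
    """Pull events from the extractor, run each through the filter
    chain, and hand the survivors to the applier.
    Returns the number of events applied."""
    applied = 0
    for event in extractor:
        current: Optional[Event] = event
        for f in filters:
            current = f(current)
            if current is None:
                break  # a filter dropped the event
        if current is not None:
            applier(current)
            applied += 1
    return applied


def only_schema(name: str) -> Filter:
    """Filter that drops every event outside the given schema."""
    def _filter(event: Event) -> Optional[Event]:
        return event if event.schema == name else None
    return _filter


def rename_schema(old: str, new: str) -> Filter:
    """Filter that transforms events by renaming their schema."""
    def _filter(event: Event) -> Optional[Event]:
        if event.schema == old:
            return Event(event.seqno, new, event.sql)
        return event
    return _filter


# Assumption: a list plays the extractor, another list plays the target.
source = [
    Event(1, "media", "INSERT INTO pages VALUES (1)"),
    Event(2, "scratch", "DELETE FROM tmp"),
    Event(3, "media", "UPDATE pages SET title = 'x' WHERE id = 1"),
]
target: List[Event] = []
count = run_pipeline(
    source,
    [only_schema("media"), rename_schema("media", "media_copy")],
    target.append,
)
```

Because each stage only sees events, any one of them can be swapped out without touching the other two, which is the point of the plug-in design.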
This sounds pretty simple and it is. But it turns out to be amazingly flexible. I'll just give one example.

Say you are using Memcached to hold pages for a media application. The media database is loaded from a "dumb" third-party feed piped in through mysql. Normally you would build some mechanism into the feed processor that connects to the database and then updates Memcached accordingly. Okay, that works, but your feed processor just got a lot more complicated. Now there's a better way. You can write an Applier that converts SQL events from the database into Memcached calls that invalidate the corresponding pages. Then you can write a Filter that throws away any SQL events you don't want to see. Voilà! Problem solved. Even better, because it works off the database log, this approach works no matter how you load the database.
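The Memcached idea can be sketched in a few lines. Treat this as an illustration under stated assumptions, not as Tungsten code. RowEvent, page_filter, CacheInvalidatingApplier, and replicate are hypothetical names. The sketch assumes each event already exposes the table name and primary key of the changed row, and that pages are cached under keys of the form "page:<id>". FakeCache is an in-memory stand-in for a Memcached client, and the only operation it provides is delete(key).

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass
class RowEvent:
    """Hypothetical event: which table changed and which row."""
    table: str
    primary_key: int


class FakeCache:
    """In-memory stand-in for a Memcached client (assumption:
    the real client offers a delete(key) operation)."""

    def __init__(self, entries: Dict[str, str]) -> None:
        self.entries = dict(entries)

    def delete(self, key: str) -> None:
        # Deleting a missing key is a no-op.
        self.entries.pop(key, None)


def page_filter(event: RowEvent) -> Optional[RowEvent]:
    """Filter: keep only changes to the pages table, drop the rest."""
    return event if event.table == "pages" else None


class CacheInvalidatingApplier:
    """Applier: instead of writing to a database, invalidate the
    cached page that corresponds to the changed row."""

    def __init__(self, cache: FakeCache) -> None:
        self.cache = cache

    def apply(self, event: RowEvent) -> None:
        # Assumed key scheme: "page:<primary key>".
        self.cache.delete(f"page:{event.primary_key}")


def replicate(events: Iterable[RowEvent], cache: FakeCache) -> None:
    """Run each event through the filter, then the applier."""
    applier = CacheInvalidatingApplier(cache)
    for event in events:
        kept = page_filter(event)
        if kept is not None:
            applier.apply(kept)


cache = FakeCache({
    "page:1": "<html>old page 1</html>",
    "page:2": "<html>old page 2</html>",
})
# The feed changed page 1 and wrote an unrelated audit row.
replicate([RowEvent("pages", 1), RowEvent("audit_log", 2)], cache)
```

After the run only page 1 has been invalidated. The audit_log event never reaches the applier, and the feed processor needs no knowledge of the cache at all.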

Tungsten Beta has a number of other interesting features beyond pluggable replication. Our next builds will support MySQL row replication fully and have much better heterogeneous replication. I'm going to cover these in future blog posts. Incidentally, MySQL 5.1 row replication is a highly enabling feature for many data integration problems. If you have not checked it out already, I hope our replication will motivate you to do so in the very near future.

Meanwhile, please download the build and take it out for a spin. Builds, documentation, bug tracking, wikis and much more are available on our community site. Have fun!
