The Scale-Out Blog: November 2011

Nov 18, 2011

Replicating Data Now and Then with Tungsten

What do cruise ship management software and data warehouses have in common? One answer: they both depend on intermittent data replication. Large vessels collect data to share with a home base whenever connectivity permits. If there is no connection, they just wait until later. Data warehouses also do not replicate constantly. Instead, it is often far faster to pool updates and load them in a single humongous batch using SQL COPY commands or native loaders. Replicating updates in this way is sometimes known as batch replication. Tungsten Replicator supports it quite easily.

To illustrate, we will consider a Tungsten master/slave configuration. (Sample setup instructions here.) In this example, web sales data stored in MySQL is uploaded to a data warehouse. The master receives constant updates, which are then applied at controlled intervals on the slave.

The first step is to turn off the slave replicator. Log in to the prod2 host and execute the following command.

$ trepctl offline

The prod2 slave will disconnect from the master as well as the data warehouse. Updates now accumulate on the master. We can turn on the slave to fetch and apply them all, then go offline again using one of three methods. The first method uses the current sequence number on the master. Here are sample commands to fetch and apply all transactions from the master up to the current master position.

$ trepctl -host prod1 status |grep appliedLastSeqno
appliedLastSeqno : 19600
$ trepctl online -seqno 19600
$ trepctl wait -state OFFLINE -limit 300

As you can see, the first command locates the master sequence number. The second command tells the slave to go online and replicate to sequence number 19600. Finally, the third command waits until either the slave is back in the offline state or 300 seconds elapse, whichever comes first. This is not strictly necessary for replication but is very handy for scripts, as it eliminates a potentially awkward polling loop.
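The three commands above can be wrapped in a pair of shell functions. This is only a sketch: the function names parse_seqno and batch_to_seqno are made up for illustration, and the parsing assumes the status output has the "appliedLastSeqno : 19600" shape shown above.

```shell
# Hypothetical helpers (names are not part of Tungsten).

# Read `trepctl status` output on stdin and print the appliedLastSeqno value.
parse_seqno() {
    awk -F: '/appliedLastSeqno/ { gsub(/[ \t]/, "", $2); print $2; exit }'
}

# Replicate up to the master's current sequence number, then wait for OFFLINE.
# $1 = master host (for example prod1), $2 = wait limit in seconds
batch_to_seqno() {
    seqno=$(trepctl -host "$1" status | parse_seqno) || return 1
    if [ -z "$seqno" ]; then
        echo "could not read master sequence number" >&2
        return 1
    fi
    trepctl online -seqno "$seqno" &&
        trepctl wait -state OFFLINE -limit "$2"
}
```

Run on the slave host, a call such as batch_to_seqno prod1 300 would repeat the sequence shown earlier.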

The second method is to use the MySQL binlog position on the master. The idea is the same as the previous example. We get the master binlog position, then tell the slave to apply transactions to that point and go offline. Here's an example:

$ mysql -utungsten -psecret -hprod1 -e 'show master status'
+------------------+----------+--------------+------------------+
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB |
+------------------+----------+--------------+------------------+
| mysql-bin.002023 | 92395851 |              |                  |
+------------------+----------+--------------+------------------+
$ trepctl online -event mysql-bin.002023:0000000092395851
$ trepctl wait -state OFFLINE -limit 300

Note in this example that you must pad the binlog offset out to 16 digits, which means you must add the leading zeros in front of 92395851. Tungsten compares native replication IDs as strings, so that we can handle other databases besides MySQL. This is normally a minor inconvenience, unless you don't know the trick. In that case it could be a bit of a head-scratcher.
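In a script, the padding is easy to get right with printf. The sketch below uses two hypothetical helper names, binlog_event_id and master_event_id. It assumes the mysql client's -N (no column names) and -B (tab-separated batch output) options, and reuses the tungsten/secret login from the example above.

```shell
# Hypothetical helpers (names are not part of Tungsten or MySQL).

# Build an event ID from a binlog file name and offset, zero-padding the
# offset to 16 digits. The offset must not already carry leading zeros,
# or printf would read it as octal.
binlog_event_id() {
    printf '%s:%016d\n' "$1" "$2"
}

# Ask the master ($1 = host) for its binlog position and print the event ID.
master_event_id() {
    mysql -utungsten -psecret -h"$1" -N -B -e 'show master status' |
        while read -r file pos rest; do
            binlog_event_id "$file" "$pos"
        done
}

binlog_event_id mysql-bin.002023 92395851
# prints mysql-bin.002023:0000000092395851
```

The result can then be passed straight to trepctl online -event.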

There is a final way to implement batch replication using Tungsten's built-in heartbeat mechanism. With this method we insert a named heartbeat event on the master, then ask the slave to replicate until the heartbeat appears. Here's an example:

$ trepctl -host prod1 heartbeat -name batch1
$ trepctl online -heartbeat batch1
$ trepctl wait -state OFFLINE -limit 300

This method is perhaps the simplest of all, because there is no need to check for either sequence numbers or binlog offsets on the master. The only downside is that you must have a master and a slave replicator to use it. It does not work with direct replication, in which a single replicator moves data from the master DBMS to the slave. (This limitation will be removed in the future when Issue 228 is fixed.)
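A minimal sketch of the heartbeat method as a reusable shell function follows. The name batch_to_heartbeat is hypothetical; the trepctl invocations are the same three used in the example above, run in the same order.

```shell
# Hypothetical helper (name is not part of Tungsten).
# Insert a named heartbeat on the master, replicate up to it, then wait
# for the slave to return to OFFLINE.
# $1 = master host, $2 = heartbeat name, $3 = wait limit in seconds
batch_to_heartbeat() {
    trepctl -host "$1" heartbeat -name "$2" &&
        trepctl online -heartbeat "$2" &&
        trepctl wait -state OFFLINE -limit "$3"
}
```

A call such as batch_to_heartbeat prod1 batch1 300 reproduces the example. Giving each run a distinct heartbeat name, for instance one that includes a timestamp, is a design choice assumed here rather than something the replicator requires, but it makes it easy to tell batches apart in the status output.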

When using any of these techniques, we may want to know whether Tungsten will really go offline at the correct point. Fortunately, there's a simple way to find out. The trepctl status command shows pending requests to go offline. Let's say you check status after requesting the slave to replicate to a heartbeat as in the previous example.

$ trepctl status
Processing status command...
NAME VALUE
---- -----
appliedLastEventId : mysql-bin.002023:0000000104369615;37978
appliedLastSeqno : 220126
appliedLatency : 470.589
...
offlineRequests : Offline at heartbeat event: batch1
...
state : ONLINE
timeInStateSeconds : 2.436
uptimeSeconds : 1742.0
Finished status command...

It is simple to see from the status output that Tungsten will go offline when it sees a heartbeat named batch1.
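For scripts, the pending request can be pulled out of the status output with a one-line filter. This is a sketch that assumes the "offlineRequests : ..." line format shown above; the function name pending_offline_request is made up for illustration.

```shell
# Hypothetical helper (name is not part of Tungsten).
# Read `trepctl status` output on stdin and print the offlineRequests value.
pending_offline_request() {
    sed -n 's/^[[:space:]]*offlineRequests[[:space:]]*:[[:space:]]*//p'
}
```

Typical use would be trepctl status | pending_offline_request, which for the output above prints the text "Offline at heartbeat event: batch1".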

As this article shows, the trepctl online and trepctl wait commands make it very simple to implement batch replication. You can simplify still further by wrapping the commands in a short script written in your favorite scripting language. Either way you have a handy solution to a problem that affects a diverse set of applications.

This is not the end of Tungsten features to enable batch replication. Tungsten has a new applier that can submit transactions using CSV files, which is critical to load transactions quickly into data warehouses. We have been testing it out with Vertica, where early results show that it improves load performance by a factor of 100 or more in some cases. I will describe this new feature in an upcoming article.

Nov 16, 2011

Why So Many Proprietary Rewrites of MySQL and InnoDB?

Every couple of weeks or so I get marketing email from a Continuent competitor advertising a closed-source clone of MySQL. It is said to be pumped up on illegal substances and therefore the solution to all my problems. I like this sort of spam because it makes it easier to track what the neighbors are up to. However, it does bring up a question. Why are so many companies offering what amount to proprietary replacements of MySQL? This does not mean alternative builds like Percona or MariaDB. It means products like Clustrix, Schooner, or Xeround, which replace MySQL entirely, or like ScaleDB or Tokutek, which replace InnoDB.

There's nothing wrong with proprietary software, of course. And there is nothing wrong with rewriting things to make them better. The rewrites are a tribute to the vitality of the MySQL marketplace and in some cases quite clever as well. However, the proprietary offerings tend to obscure an important truth about MySQL. Most businesses that run on open source software have problems with MySQL management, not with MySQL itself.

Here is a simple example. Say you have 2 Terabytes in MySQL 5.1. How do you upgrade from MySQL 5.1 to 5.5 without incurring an application outage? This is a big problem for 24x7 web-facing applications. You don't need to rewrite MySQL to do zero-downtime upgrades. MySQL with InnoDB already works fine. You just need a way to shift connections transparently to a new master database, upgrade the old master, and shift back when you are done. Similar reasoning applies for slave provisioning, automated failover, spreading load over replicas to improve performance, or operating across multiple sites.

At Continuent we concluded a number of years ago that you don't need to change MySQL to manage data effectively. We therefore designed Tungsten Enterprise, Continuent's commercial clustering solution, to work with unaltered MySQL. Tungsten Enterprise uses master/slave replication (i.e., my favorite replicator), distributed management, and transparent connectivity to make a set of standard MySQL or PostgreSQL servers look like a single highly available DBMS that distributes load across all replicas. This architecture has tremendous advantages, because it complements the strengths of MySQL itself. Here are a few of the principal benefits.

Transparency. Everything runs on standard MySQL from initial development to large-scale deployment. Application code runs the same way on a dev laptop or production. Application bugs in production are reproducible on the laptop. Standard MySQL configuration and tuning also work, because this is standard MySQL.

No lock-in. Don't like Tungsten Enterprise? Use something else or revert to simple MySQL. There's no need to change your database or migrate data.

Data integrity. InnoDB has had years to shake out bugs, especially those involving data corruption. There are still a few but they do not typically show up unless there is a bad hardware failure or you configure your system incorrectly. (Hint #1: don't use MyISAM.) Do you really want to give this up for a new store implementation?

Scalability. MySQL performance is improving constantly, driven by competition between builds, an active community, investment from Oracle and large web properties like Facebook. SSDs are also increasingly affordable and make a lot of performance problems evaporate. As MySQL improves in this and other areas, you get the benefits. The trick is to have a way to upgrade. I mentioned the MySQL 5.1 to 5.5 upgrade problem for precisely this reason.

Operational costs. There is a deep pool of administrators and administrative tools for MySQL. Thanks to books like High Performance MySQL, abundant talks, and a wealth of community resources as well as consulting, there is little mystery about how things work. I probably don't even need to discuss license costs.

Viability. MySQL is not going anywhere. Oracle is continuing to invest in the core database, and Percona, MariaDB and most important Microsoft will ensure Oracle stays on its toes. At Continuent we do our best to keep our friends at Oracle competitive on replication. Innovation on open source MySQL will continue for years to come. (Psst, MySQL guys at Oracle are welcome to come work for us. :)

Given the number of advantages that off-the-shelf MySQL confers, the real question is why our approach is not more popular. Actually it is. For all the marketing attention generated by proprietary MySQL or InnoDB rewrites, many hundreds of billions of transactions per day run on unaltered MySQL. Switching to proprietary versions of MySQL is a substantial wrench for most businesses, because the economics run so strongly in favor of open source DBMS. However, the open source tools for managing MySQL are by and large inadequate, in part because some of the problems turn out to be rather difficult to solve.

When we started to rethink database clustering at Continuent back in 2007, we therefore focused on solving the problems outside MySQL that make data management hard. That includes building fast replication with global transaction IDs, so you can fail over easily to up-to-date live replicas. It includes building distributed, rule-based management that has simple primitives like "recover" to fix a broken slave. It includes speedy, transparent connectivity that can spread reads intelligently across multiple servers and reroute connections transparently to allow maintenance without halting applications. Finally, it includes simplifying management so that users don't spend much time worrying about their data. These capabilities are now very robust and help customers handle hundreds of millions of transactions per day.

It is obvious that off-the-shelf MySQL (and PostgreSQL too) is already very good and continuing to get better. For most users there is no need to migrate to proprietary offerings that give up the leverage conferred by open source databases. Tungsten Enterprise solves the difficult problems that are critical to building businesses on standard MySQL. If you are building new systems based on MySQL or scaling old ones, you should look hard at what we have done.

Nov 13, 2011

I Really Dislike Anonymous Attacks

If you are interested in NoSQL databases (or maybe not) perhaps you have seen the anonymous "warning" about using MongoDB. It concludes with the following pious request:

Please take this warning seriously.

Now there are a lot of great resources about data management on the web but the aforementioned rant is not one of them. If you plan to write technical articles and have people take them seriously, here are a few tips.

Sign your name. Readers are more impressed when they see you are not afraid to stand behind your words.
Explain what problem you were trying to solve. Otherwise uncharitable readers might think you just started pumping information into a new database without thinking about possible consequences and now want to blame somebody else for your bad decision.
Explain how you could do better. Not all designs work out, so propose alternatives. Readers love to see authors demonstrate that they are not discouraged by adversity.

As for most of the points made by the anonymous author, all I can say is: well, duh!

MongoDB behavior with respect to global write locking and transaction durability is obvious from the official documentation. These features are not my cup of tea, but it's not as if 10gen is hiding them either. Moreover, most people understand that new DBMS implementations have problems, not least of all losing data now and then. You usually pick them because they have features that make it worth putting up with the immaturity. I am not an expert on MongoDB, but I can say from experience it is amazingly easy to load JSON objects into it. The up-front usability alone demonstrates excellent engineering. I am sure for this reason that there are many other good features.

p.s., Here is a point-by-point response from 10gen, helpfully pointed out by Alex Popescu.
