Monday, December 1, 2008

Don't Shy Away from MySQL 5.1!

MySQL 5.1 is GA. Let the fear and loathing begin. In a recent post Monty describes a number of problems that he feels should have prevented a GA declaration at this time. I like Monty's forthrightness immensely and his words have strongly influenced our work to develop the Tungsten Replicator. That said, I must respectfully disagree with his opinion.

It's hard to comment on the overall quality of 5.1, though I have yet to hit any bugs personally after using it intermittently for almost a year. However, we have done a lot of work with MySQL row replication. Monty points out several bugs in the row replication implementation. Frankly, they would not hold me back. Row replication eliminates so many strange corner cases from statement replication that its advantages outweigh a few bugs. The MySQL 5.1 manual sums it up accurately:

Advantages of row-based replication:

  • All changes can be replicated. This is the safest form of replication.

Beyond issues like provable correctness, row replication is simply more flexible than statement replication. Heterogeneous replication is an obvious example. Our own Tungsten Replicator can replicate statements from MySQL 5.0 to Oracle. That's great if you use completely vanilla SQL and stick to int and varchar datatypes. For real applications, however, you need a data structure that transfers datatypes accurately and is easy to morph across schema differences. Similar reasoning applies when using replication for application upgrades. Finally, row replication is the only viable path for implementing parallel slave update, which is increasingly necessary on multi-core hosts. I can't speak directly for Mats Kindahl and other members of the replication team, but there's no doubt they see row replication as the foundation to solve a number of key problems.

For these and other reasons our team at Continuent has devoted quite a bit of effort to reading row updates in MySQL 5.1 binlogs. Obviously, we have some uses in mind that go well beyond simple MySQL to MySQL data transfer. However, I would not shy away from MySQL 5.1 if I were using native replication. Instead, I would be testing row replication today to see what problems it solves for me. Congratulations to the MySQL team for getting this feature out the door.


Mark Callaghan said...

Can you elaborate on provable correctness?

Robert Hodges said...

Gladly. Assuming that the binlog is a serial transaction history consisting of row updates corresponding to the serialized history of the original database, applying the transactions in serial order on another database will result in equivalent updates. This property does not hold for statement replication--think UPDATE with subselects for a rich field of counter-examples. You can also reorder such histories to allow things like parallel update.

Note that I didn't say that MySQL code itself is provably correct. In fact, as Monty's tasteful selection of replication bugs shows, it most definitely is not!
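
To make the serial-history argument concrete, here is a toy sketch in Python. It is not MySQL code and nothing in it comes from the MySQL source; the table layout, the function names, and the choice of statement are all assumptions made for illustration. Instead of an UPDATE with a subselect, it uses a simpler stand-in: an UPDATE with LIMIT 1 and no ORDER BY, whose result depends on the physical row order. The replicas hold the same logical data as the master but store the rows in a different order.

```python
from typing import List, Tuple

Row = Tuple[int, int]  # (primary key, value); a hypothetical two-column table


def update_limit_1(rows: List[Row]) -> List[Row]:
    """Toy 'UPDATE t SET v = v + 10 LIMIT 1' with no ORDER BY.

    Changes whichever row comes first in storage order and returns
    the after-image of that row, as a row-based log would record it.
    """
    pk, value = rows[0]
    rows[0] = (pk, value + 10)
    return [(pk, value + 10)]


def apply_row_events(rows: List[Row], events: List[Row]) -> None:
    """Apply logged after-images to a replica, matching rows by primary key."""
    for pk, new_value in events:
        for i, (row_pk, _) in enumerate(rows):
            if row_pk == pk:
                rows[i] = (pk, new_value)


# Same logical contents everywhere, different physical order on the replicas.
master = [(1, 100), (2, 200), (3, 300)]
statement_replica = [(3, 300), (2, 200), (1, 100)]
row_replica = [(3, 300), (2, 200), (1, 100)]

row_events = update_limit_1(master)         # master executes and logs row images
update_limit_1(statement_replica)           # statement replica re-runs the SQL
apply_row_events(row_replica, row_events)   # row replica applies the images

# Compare logical contents, ignoring storage order.
statement_matches = dict(statement_replica) == dict(master)
row_matches = dict(row_replica) == dict(master)
print(statement_matches, row_matches)
```

In this sketch the statement replica updates a different row than the master did, while the row replica ends up with the same contents, because the row events name the affected primary key and the value it received rather than the logic that chose it.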

htct150 said...

I can't find the 5.1 version of MySQL. Where can I find it?
