Wednesday, April 29, 2009

Overcoming MySQL-to-Oracle Culture Shock

Migrating from Oracle to MySQL is not easy. A few weeks ago Baron Schwartz summarized the culture shock in 50 things to know before migrating Oracle to MySQL. It's a great article, but as you read through the comments it's easy to forget that culture shock can run the other way.

For example, try building horizontally scaled systems. Oracle has excellent "small" database editions like SE and SE1. However, they lack built-in replication of the type provided by MySQL. Even simple and effective deployment patterns like master-master replication do not exist. The usual approach in the Oracle world is to use RAC + Enterprise Edition features like Streams and DataGuard. That's great for large enterprises, but it's not a good method for smaller businesses and start-ups.

We have been working for some time on a better answer. We are now opening up for general beta testing a commercial extension to our Tungsten Replicator to address replication for Oracle. The new extension adds a process to read Oracle redo logs but otherwise fits neatly into the overall replicator design. It works on Linux Oracle Editions from XE to EE.

Implementing Oracle replication has been a long and arduous effort. Oracle has a huge feature set and a correspondingly elaborate log. It is far more challenging to read than the MySQL binlog. We currently handle basic data types as well as DDL statements. Large object types and XML are on the way. The implementation is a step-by-step process and one that needs to be guided by close work with customers.

On the other hand, Oracle has the features to make advanced replication really work. Most Oracle DBAs know about supplemental logging, which among other things adds keys to data so you can identify updated rows unambiguously. However, there are also far more interesting features like flashback queries, which allow you to see the state of the database at earlier points in time. It makes generating SQL from log entries much easier because we can see the state of system catalogs as of the exact time each update occurred. Flashback query was not on Baron's list or the comments that followed, but it is one of the truly great features of Oracle databases.
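
To make the flashback idea concrete, here is a rough sketch that builds a query reading a table as it looked at a given system change number (SCN). The AS OF SCN clause is standard Oracle flashback query syntax. The helper name, the table, and the SCN value are hypothetical illustrations and are not taken from the replicator's code.

```python
import re

# Accept only plain identifiers, optionally schema-qualified (e.g. hr.employees).
_IDENT = re.compile(r"^[A-Za-z][A-Za-z0-9_$#]*(\.[A-Za-z][A-Za-z0-9_$#]*)?$")


def flashback_query(table: str, scn: int) -> str:
    """Build a SELECT that reads `table` as of system change number `scn`.

    Hypothetical helper for illustration only.
    """
    if not _IDENT.match(table):
        raise ValueError(f"unsafe table name: {table!r}")
    if scn < 0:
        raise ValueError("SCN must be non-negative")
    return f"SELECT * FROM {table} AS OF SCN {int(scn)}"


# Example: read a (hypothetical) table as of the SCN recorded with a log entry.
print(flashback_query("hr.employees", 1234567))
```

A log reader could run a query like this against catalog data using the SCN attached to each log record, so the column definitions it sees match the moment of the update rather than the present.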

If you are interested in alternatives for existing Oracle replication, I would like to encourage you to contact us at Continuent. We are looking for customers who want to work closely with us to build out economical Oracle replication support. MySQL has shown over the years the power of lightweight, simple-to-use replication. It's going to be a pleasure to make it work on Oracle.

Finally, there needs to be a list of 50 things you need to know about migrating from MySQL to Oracle. Open source databases are popular not just because they offer free downloads. Simplicity of operation, replication, and support for incremental scale-out patterns are among the strengths of MySQL. It takes some thought and effort to translate them into Oracle.

p.s., Since I wrote this article Robert Treat obligingly started the Oracle to MySQL 50 things list. Several people chipped in to get it up to 50.

Sunday, April 26, 2009

Tungsten Replicator Build 1.0.1 Available

A new build of the Tungsten Replicator is now available. As you probably know from reading this blog, Tungsten Replicator provides advanced open source replication for MySQL. There is also a commercial extension to support Oracle. Tungsten Replicator 1.0.1 includes a number of important improvements.
  • Much better performance -- Current benchmark results show throughput of up to 650 inserts per second using a single slave apply thread. We are well on the way to our goal of 1000 inserts per second.
  • Simplified management -- Replicator administration has been largely reduced to two commands: online and offline. There is an option to go online automatically at startup, which further simplifies operation and makes it easy for the replicator to operate as a service.
  • Easy-to-use consistency checks -- You just type trepctl check database.tablename.
  • Lots of bug fixes and small improvements -- Check the release notes in file README.UPGRADE.
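
The general idea behind a table consistency check can be sketched in a few lines: compute a checksum over the rows on the master and on the slave, then compare the two. This is only an illustration of the technique and makes no claim about how trepctl check is implemented. The function names and sample rows are hypothetical.

```python
import hashlib


def table_checksum(rows):
    """Return a hex digest over the rows, sorted so row order does not matter."""
    digest = hashlib.sha1()
    for row in sorted(rows):
        # repr() gives a stable text form for simple values such as ints and strings.
        digest.update(repr(row).encode("utf-8"))
    return digest.hexdigest()


def tables_consistent(master_rows, slave_rows):
    """True when both sides hold exactly the same set of rows."""
    return table_checksum(master_rows) == table_checksum(slave_rows)


master = [(1, "alice"), (2, "bob")]
slave_ok = [(2, "bob"), (1, "alice")]      # same rows, different order
slave_bad = [(1, "alice"), (2, "bobby")]   # one value has drifted

print(tables_consistent(master, slave_ok))   # True
print(tables_consistent(master, slave_bad))  # False
```

Comparing digests rather than full rows keeps the amount of data exchanged between master and slave small, which is why checksums are a common choice for this kind of check.
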
We also have some great features on tap for the next couple of releases. An integrated flush operation to simplify failover, built-in backup/restore, and parallel replication are just a few. I'm particularly excited about parallel replication, as it has the potential to boost throughput into the 1000s of updates per second and to support sharding as well. You can track development progress on the Tungsten Replicator JIRA list.

For more information check out the Tungsten Replicator community pages. You can grab binary downloads or look at source code on the Tungsten project on SourceForge.net. The 1.0.1 build is a considerable improvement over the previous beta releases and I hope you will try it out. We look forward to your feedback.

Friday, April 24, 2009

MySQL Conference Impressions and Slides

"Interesting" was probably the most overused word at the MySQL Conference that just ended yesterday. Everyone is waiting to find out more about the Oracle acquisition of Sun. As a community we need to find some synonyms or things will become very tiresome. Personally I vote for intriguing.

Here are slides for my presentations at the MySQL Conference as well as the parallel Percona Performance is Everything Conference. Thanks to everyone who attended as well as to the organizers. You had wonderful ideas and suggestions.


Finally, some short impressions on the conference. The two most intriguing trends were advances in hardware, especially memory and SSDs, as well as clouds. These are altering the economics of computing in fundamental ways: business costs as well as performance trade-offs in many of the basic algorithms for data management. Combined with the ferment of projects spinning off from MySQL and others, they are fueling an incredible burst of creative thinking about databases.

By comparison, Oracle consuming Sun is merely interesting.

Tuesday, April 7, 2009

Tungsten Replicator at the 2009 MySQL UC

It's good to get out of the office and meet people. This year I'll be doing several presentations at the 2009 MySQL Conference and adjacent Percona Performance Conference in Santa Clara. These include among others a talk on Tungsten Replicator on Thursday April 23 at 10:50.

In case you don't read this blog regularly, Tungsten Replicator provides advanced open source replication for MySQL. The term "advanced" is not an exaggeration. I'll be covering how to solve practical problems including the following:
  • How to install and configure Tungsten Replicator in 5 minutes or less.
  • How to set up seamless slave promotion after a master fails.
  • How to prevent loss of data from administrative errors using time delay replication.
  • How to identify data inconsistencies using built-in checksums.
  • How to move data from higher to lower MySQL versions.
  • How to reduce slave latency by dropping DDL commands from replicator events.
  • How to recover quickly and easily from statements that fail to replicate properly.
  • How to replicate from MySQL to Oracle as well as things that are not even databases.
There will be some short demos along the way. I hope you'll join me for a fun and informative talk. And please bring interesting replication problems with you!
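
As a taste of one item on the list, time delay replication can be sketched as a queue that holds each replicated event until it is old enough to apply. This is a minimal illustration of the idea under simple assumptions (timestamps in seconds, events arriving in commit order). It is not Tungsten Replicator code, and the class and method names are hypothetical.

```python
from collections import deque


class DelayedApplier:
    """Hold replicated events until they are at least `delay_seconds` old."""

    def __init__(self, delay_seconds):
        self.delay_seconds = delay_seconds
        self.pending = deque()  # (commit_time, statement), in commit order

    def receive(self, commit_time, statement):
        """Queue an event fetched from the master."""
        self.pending.append((commit_time, statement))

    def due(self, now):
        """Pop and return the statements whose delay has elapsed as of `now`."""
        ready = []
        while self.pending and now - self.pending[0][0] >= self.delay_seconds:
            ready.append(self.pending.popleft()[1])
        return ready


applier = DelayedApplier(delay_seconds=3600)
applier.receive(0, "INSERT INTO orders VALUES (1)")
applier.receive(1800, "DROP TABLE orders")  # the administrative mistake

print(applier.due(now=3600))  # ['INSERT INTO orders VALUES (1)']
print(len(applier.pending))   # 1
```

With a one hour delay, the mistaken DROP TABLE is still sitting in the queue half an hour after the insert has been applied, which leaves time to take the slave offline before the error reaches it.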

p.s., For extra credit download Tungsten Replicator and try it out before the talk. I look forward to your questions and comments.

Friday, April 3, 2009

Contemplating the MySQL Diaspora

The break-up of the MySQL codeline is finally attracting attention from polite society outside the open source database community. This attention has been accompanied by much speculation, some of it informed and some not so informed, about what is driving the split. Since everyone else is chipping in theories about how and why, here's mine:

It's the economy, stupid.

First, MySQL AB seeded a huge market for the MySQL database. MySQL 5.1, for all the controversy, hit a million downloads in a little over a month. This is open source success on a grand scale that has created a huge pent-up demand for bug fixes as well as new features from a wide variety of users. Leaving aside consideration of Sun/MySQL missteps, it's somewhat hard to see how Sun would meet the competing market demands and still keep the database simple enough for everyone to use easily.

Second, the core MySQL server code is licensed under GPL V2, so anyone can take a copy, modify it, and create their own distribution. There is abundant proof from companies like Percona and many others that you can create viable businesses by offering services on these distributions without owning the code. That's critical because it means alternative branches are economically viable.

Third, pure open source projects can innovate very rapidly because they can accept contributions from the entire community. However, not everyone can or will merge the same patches--the Google semi-synchronous patch is a good example of a very useful patch that is also non-trivial to merge. So the split between branches is likely to increase over time depending on which part of the MySQL market each project chooses to serve. That's not even considering more-or-less full breaks like Drizzle.

OK, maybe it's cheating to steal catchy lines from James Carville, but this looks like simple economics at work. There is a huge market, plenty of room for businesses that don't own the code, and lots of opportunities for alternative versions.

There are arguments from people like Jeremy Zawodny that MySQL will hold together like the Linux kernel with different distributions around a common core. Once you get a lot of participants that kind of standardization is tough to manage. In fact one of the real strengths of open source development is that it does not follow standards. Sun no longer really controls the core of MySQL, and there are a lot of motivations to change it.

In the end what's happening to MySQL looks a bit like the fracturing of Unix in the 1980s--the BSD and System V variants quickly evolved into a separate version for each hardware vendor. There were various attempts to standardize, but they weren't especially successful. Instead, Intel undermined the proprietary chip model which in turn made the other hardware vendors less viable. Now we all run Linux or Windows.

One final thing--what does this mean for users? I think Jeremy has it right that at some point it does not matter. We are at the beginning of an era of multiple viable choices for open source databases. Some users will choose one of the new MySQL builds. Some users will jump ship to PostgreSQL. However, people being what they are, a lot of users will just stick with the version that they are currently running. In spite of other misfortunes, that should be at least some consolation for Sun.

Scaling Databases Using Commodity Hardware and Shared-Nothing Design