Friday, April 3, 2009

Contemplating the MySQL Diaspora

The break-up of the MySQL codeline is finally attracting attention from polite society outside the open source database community. This attention has been accompanied by much speculation, some of it informed and some not so informed, about what is driving the split. Since everyone else is chipping in theories about how and why, here's mine:

It's the economy, stupid.

First, MySQL AB seeded a huge market for the MySQL database. MySQL 5.1, for all the controversy, hit a million downloads in a little over a month. This is open source success on a grand scale, and it has created huge pent-up demand for bug fixes as well as new features from a wide variety of users. Leaving aside Sun/MySQL missteps, it's hard to see how Sun could meet the competing market demands and still keep the database simple enough for everyone to use easily.

Second, the core MySQL server code is licensed under GPL V2, so anyone can take a copy, modify it, and create their own distribution. There is abundant proof from companies like Percona and many others that you can create viable businesses by offering services on these distributions without owning the code. That's critical because it means alternative branches are economically viable.

Third, pure open source projects can innovate very rapidly because they can accept contributions from the entire community. However, not everyone can or will merge the same patches--the Google semi-synchronous patch is a good example of a very useful patch that is also non-trivial to merge. So the split between branches is likely to increase over time depending on which part of the MySQL market each project chooses to serve. That's not even considering more-or-less full breaks like Drizzle.

OK, maybe it's cheating to steal catchy lines from James Carville, but this looks like simple economics at work. There is a huge market, plenty of room for businesses that don't own the code, and lots of opportunities for alternative versions.

There are arguments from people like Jeremy Zawodny that MySQL will hold together like the Linux kernel, with different distributions around a common core. Once you get a lot of participants, that kind of standardization is tough to manage. In fact, one of the real strengths of open source development is that it does not follow standards. Sun no longer really controls the core of MySQL, and there are a lot of motivations to change it.

In the end, what's happening to MySQL looks a bit like the fracturing of Unix in the 1980s--the BSD and System V variants quickly evolved into a separate version for each hardware vendor. There were various attempts to standardize, but they weren't especially successful. Instead, Intel undermined the proprietary chip model, which in turn made the other hardware vendors less viable. Now we all run Linux or Windows.

One final thing--what does this mean for users? I think Jeremy has it right that at some point it does not matter. We are at the beginning of an era of multiple viable choices for open source databases. Some users will choose one of the new MySQL builds. Some users will jump ship to PostgreSQL. However, people being what they are, a lot of users will just stick with the version that they are currently running. In spite of other misfortunes, that should be at least some consolation for Sun.

7 comments:

Adrian Klaver said...

The basic problem is different expectations. MySQL grew up as the database for web applications, where data is treated as flat files. MySQL 5.X is an attempt to bring the full power of the relational model to bear and make MySQL enterprise material. This is what Sun paid $1 billion for. What it got was something less than that; see Monty's comments. Now there is a fork in the user base between those who want the simple model of the 4.X series and those who want the enterprise features promised in the 5.X series. What we are seeing is the race to see who can be the first to satisfy the respective usage patterns.

Robert Hodges said...

Your comment seems to miss a key constituency, namely people who are running reasonably vanilla applications on MySQL 5.0 to 5.1 releases and just want bugs fixed. There are at least three forks now aimed at those users.

Adrian Klaver said...

My guess is that at some point they are going to have to make a choice: roll back to the 4.X model or commit fully to the 5.X model. While the ability to fork Open Source projects is great, it can also become a burden. Right now we are seeing a MySQL diaspora, but at some point the forks are going to coalesce. I still hold that it will be around the two usage patterns I described earlier. Those who choose to buck that are going to be left out in the cold.

krow said...

Hi!

Personally I think that we are hitting a period of renaissance in the code base. This really is the first time that so many individuals have been asking the question of "what if". I believe we are just at the starting point for what could be a period of reflection on what has been created and what we need to see done going forward.

Cheers,
-Brian

Robert Hodges said...

Hi Brian!

I'm completely with you on the notion of a database renaissance. There's another way of reading the economics--it now makes the experiments viable. There is room for at least 3 variants of MySQL, maybe more. Adrian counted two, but the market is really large.

That means there is also another perfectly reasonable conclusion to the end of the article: It's going to be really fun working on databases for quite some time.

Cheers, Robert

Adrian Klaver said...

If you are interested in the future of databases then might I suggest you attend LinuxFest NW (www.linuxfestnorthwest.org/).
See the blog (//lfnw.wordpress.com/) for information on PgDay, as well as Monty Widenius's talk. His description of his talk:
"MariaDB (MySQL branch) and Maria Engine (transactional storage engine for MySQL). The presentation will be an open environment, where the audience helps to decide the topics...and the discussion is open for debate."

Sounds like a continuation of this discussion.

Robert Hodges said...

Hi Adrian,

Bummer--I'm going to miss this. It's also an excellent time for Spring skiing on Mt. Baker. :( Have fun and give my regards to Josh Drake.

Cheers, Robert
