Thursday, December 27, 2012

The MySQL Community: Beleaguered or Better than Ever?

The MariaDB Foundation announcement spawned some interesting commentary about the state of open source databases.  One recent headline cited the "beleaguered MySQL community." Beleaguered is a delightful adjective.  The OED tells us that it means beset, invested, or besieged.  Much as I like the word, I do not think it is an accurate or useful description of the MySQL community.  This article and others like it miss the point of what is happening to MySQL and its users.

Let's start by disproving the notion that the MySQL community is beleaguered.  I don't know everyone who uses MySQL, but in my job I talk to numerous companies that have made sizable investments in MySQL and stand to lose big if they are wrong.  They do not seem especially nervous.

1.  Nobody seriously questions MySQL's viability.  I have yet to meet a manager with a substantial business on MySQL who is deeply worried about it disappearing or being ruined by Oracle.  They are too busy working on software upgrades or keeping their sites running.  The future of MySQL is well down the list of problems keeping them awake at night.

2.  MySQL meets or beats the immediate alternatives.  There is of course discussion about dropping MySQL for PostgreSQL but it is mostly idle talk.  I'm sure some companies have switched (actually in both directions), but I have not seen a single customer migrate a working business app from MySQL to PostgreSQL.  Once you get past the religion, it's clear MySQL and PostgreSQL are just too similar to supplant each other easily:  reliable, row-based stores with single-threaded SQL query engines that handle a few terabytes of data at most.  Companies need far stronger reasons to switch to something new, especially given the large ecosystem and deep pool of MySQL expertise.

3. MySQL is not the only game in town.  Virtually every large web site I know uses at least one NoSQL store alongside MySQL.  Column stores are increasingly common for data warehouses.  Production Hadoop clusters are no longer a novelty.  On the surface this might look like a failure of MySQL.  What's really happening in many cases is that small businesses that started on MySQL are now large, profitable enterprises that require more than just economical OLTP.  This is a mark of success, not a deficiency.

If this is what beleaguered looks like I can't wait to see something that's actually successful.

Turning the argument around, can we say that the MySQL community is better than it was?  In at least one important way, yes.  The community is now multi-polar.  MySQL long benefitted from having a large community of open source users to find bugs, help focus development direction, and construct a wide range of robust tools like language bindings.  However, innovation on MySQL itself was largely gated by a single company:  MySQL AB.  Multiple groups are now competing to improve MySQL, and it's a very good thing for users.  Let me count the ways.

There are three major versions of MySQL:  Oracle, Percona, and MariaDB, not to mention cloud-only versions like Amazon RDS.  There are at least four companies working directly on major upgrades to replication:  Continuent, Oracle, Codership, and Monty Program.  Oracle is continuing to make improvements in InnoDB like online schema change and multi-core scaling, efforts that are complemented by Percona's persistent focus on all aspects of performance.  Aside from Amazon RDS, all of this work is available in open source, and there is an unusual degree of sharing across otherwise competitive groups.  I could keep going for a while but to be frank there's so much it's hard to track all the improvements or give them their proper due.

The MySQL community is therefore competitive in a way that did not exist a few years ago. That's good, because innovation in data management is no longer centered around the web-facing applications that MySQL helped enable. Businesses are grappling with massive data volumes that far exceed the capacity of single DBMS servers while simultaneously moving to Amazon or VMware. There is a whole new set of problems such as deploying in unstable cloud environments, adjusting to polyglot persistence, managing sharded data effectively, distributing data across multiple regions, and enabling real-time analytics on MySQL transactions. As a group, the MySQL community is well-positioned to address them.

If there is a problem, it is how to keep a strong multi-polar community going for as long as possible.   Competition creates uncertainty for users, because change is a given.  Pointy-haired bosses have to make decisions with incomplete information or even reverse them later. Competition is hard for vendors, because it is more difficult to make money in efficient markets.  Competition even strikes against the vanity of community contributors, who have to try harder to get recognition.  It is clear there will be pressures to make the community less competitive.  They won't necessarily be from Oracle, which thrives on competition.

This gets back to the MariaDB Foundation reference that started this article. Anything that ensures long-term competitiveness and vitality of MySQL is good.   Foundations in general seem well suited to this task.  At Continuent we have already had some discussions about joining. So far we are undecided, for reasons that are somewhat similar to Peter Zaitsev's comments on this subject.  If the MariaDB Foundation helps maintain a stable multi-polar community, we're in.  

11 comments:

Robert Young said...

-- It is clear there will be pressures to make the community less competitive. They won't necessarily be from Oracle, which thrives on competition.

Really? Historically, Larry buys up any company that his paranoia tells him might, someday, somehow, be a competitor. In the case of Sun, it's become increasingly clear that neither SPARC nor java was the nexus. For a long time, Oracle had the MVCC database market to itself. Whether such databases actually make sense is for another conversation.

InnoDB made a MVCC engine for MySql. PG was lost in the wilderness with an Oracle work-alike that nobody wanted much. Suddenly, MySql became the insurgent database. Larry scarfed it up. Competition? Ruinous competition? Larry wants none of that.

Now that both IBM (on LUW) and MS have incorporated some measure of MVCC into their databases, we'll see what happens.

hingo said...

Hi Robert

While you are right that it is a futile effort to try to enumerate everything great going on in the MySQL world, I usually try to remember mentioning also the end-user employed engineers. For example, Facebook is nowadays employing way over 10 full time MySQL engineers and contributes code that is then consumed by all 3 vendors. Not all of these are officially developers (I think there are only 3-4 of those) but since the DBAs and performance guys are names like Domas, Yoshinori, Lachlan... all of them actually have contributed either one liners or surrounding utilities.

So the Facebook team essentially matches what you have at Percona or Monty's. Twitter, Craigslist, Taobao, etc... have smaller teams but constantly come up with valuable small improvements too. (Perhaps not so small in the case of multi-source replication from Taobao.)

Robert Hodges said...

@hingo, Thanks for pointing that out. I had a sentence in about both Facebook and Twitter, which are both doing excellent work on MySQL these days. My editing was a little aggressive. (Sorry Mark, Yoshinori, Domas, Harrison, Jeremy, ...)

Robert Hodges said...

@Robert Young, you are right that Oracle wanted to acquire MySQL. It was both a threat to big Oracle and a great addition to Oracle's portfolio. Building a portfolio that makes them the one-stop IT shop for their enterprise customers is fundamental to Oracle's strategy.

However, Oracle has not taken the path to stifle MySQL many feared, at least not in an obvious way. How else to explain the continued improvements to InnoDB, which (so far as I know) all appear in open source under GPL? It appears they are satisfied if they can neutralize the threat of a DBMS growing up to compete with big Oracle and at the same time make money. So far as I can tell it's working out pretty well for them.

Ivan said...

Hi Robert,

I tend to agree with your analysis, but I would be a bit more concerned about the future. It is not the possibility to migrate from MySQL to Postgres that concerns me, but the fact that every new project in town designed and made by brilliant minds - the very same lymph that made MySQL and LAMP great - is today based on No/NewSQL DBs. I do not have any problem with that, but you will soon see MySQL being relegated to an old-fashioned technology that will be attached to an era and not to innovative technology.

The fact is, NoSQL and NewSQL have their own issues and when developers find out these issues it is too late. I believe it is up to us, MySQL Community, users, developers and ecosystem, to improve MySQL in order to make it shine again and be the best choice for new initiatives.

Thanks!
-ivan

Robert Hodges said...

@Ivan, I doubt that NoSQL will supplant MySQL any time soon at least for OLTP purposes, because MySQL/InnoDB is (a) extremely fast, (b) offers strong transactional guarantees, and (c) has flexible query via SQL. The most likely outcome is that NoSQL will run alongside MySQL and handle problems like processing extremely large, non-OLTP datasets like user session data, chats, weblogs, web search indexes, etc. that don't fit into SQL very well. That's already in place now in many companies from Google down to market automation start-ups. The same thing is happening with column stores for data warehouses.

One interesting question is whether MySQL can over time absorb some of the more useful features of NoSQL such as transparent sharding. That would start to tip the balance back to MySQL again.

Ivan said...

Robert,

Sure, I am not saying that NoSQL will supplant MySQL "soon". Old school guys use MySQL and NoSQL together. New kids in the block go with the fanciest technology of the moment - and they will not revert to MySQL later.

I participate in Hadoop, Cassandra and MongoDB meetups regularly. At least based on my experience I see this all the time.

Robert Young said...

-- New kids in the block go with the fanciest technology of the moment

And that's been the basic issue with NoSql from the start: they're no more than siloed, non-transactional, no DRI, language/application/code files. Just like the COBOL/VSAM stuff the "New kids" grandfathers did in the 60s and 70s. At least their grandpappies had CICS to keep things from falling into anarchy. The "New kids" only think they've invented something new. They've merely re-formed square wheels. But such wheels present the opportunity to churn out lots more LoC, so, from the coders' point of view, it's Really New and Cool. Not.

Andrew Dunstan said...

If you haven't seen people migrating working apps from MySQL to PostgreSQL then you need to get out more. I know I am biased but this is something I see pretty regularly. Sometimes it's motivated by things like fear of Oracle, more often it's from desire for one or more of the advanced features PostgreSQL offers, like Common Table Expressions, Window functions, or access to backend processing tools such as PLV8 or PL/R. I don't want to get into a database flame war, but you really should not assume this is not a common thing.

Robert Hodges said...

@Andrew, PG has a lot of good features. However, we have about 100 customers on MySQL at Continuent with more on the way and I have yet to see anybody switch. During that time I have seen dozens of companies switch to Percona builds and a smaller but growing number moving to MariaDB. These are for the most part OLTP systems of one kind or another.

That's not to say the switch to PG does not happen. It does. 451 Group has numbers that show PG as the most common direct replacement for MySQL. (See their May 2012 report on MySQL/NewSQL/NoSQL.) However, their numbers look a little muddled and obscure the amount of momentum behind NoSQL solutions, especially for large applications. As Ivan pointed out earlier it's hard not to be impressed by the number of people jumping into NoSQL implementations. I'm sure you have seen this at their conference.

The place where PG seems to be getting some rather interesting traction is analytics, mostly through derivatives like GreenPlum, Aster Data, Vertica (less and less a derivative with Vertica 6). Leaving aside anti-Oracle politics, the more your applications depend on sophisticated query features, the better PG looks. For basic OLTP I would still give MySQL the nod because of its speed, deep experience pool, reliability (InnoDB), and outstanding replication.

Robert Hodges said...

P.s. Thanks for the tip on PL/R. I have not read up much on that. It looks really interesting.
