Jul 13, 2008

Myosotis Connector: a Fast SQL Proxy for MySQL and PostgreSQL

SQL proxies have been very much in the news lately, especially for open source databases. MySQL Proxy and PG-Pool are just two examples. Here is another proxy you should look at: Myosotis.

Myosotis is a 'native-client' to JDBC proxy for MySQL and PostgreSQL clients. We originally developed it to allow clients to attach to our Java-based middleware clusters without using a JDBC driver. Myosotis parses the native wire protocol request from the client, issues a corresponding JDBC call, and returns the results back to the client. As you can probably infer, it's written in Java. "Myosotis" incidentally is the scientific name for "Forget-Me-Not," a humble but strikingly beautiful flower.
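To make the parse-then-forward flow concrete, here is a minimal sketch in Java. It is not the Myosotis source. The class and method names (ComQuerySketch, parseComQuery, forward, buildComQuery) are made up for illustration, and only the simplest case is covered: a single MySQL COM_QUERY packet, which carries a 3-byte little-endian payload length, a 1-byte sequence id, the command byte 0x03, and then the SQL text.

```java
import java.nio.charset.StandardCharsets;
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// Illustrative sketch only; names are hypothetical, not taken from Myosotis.
class ComQuerySketch {
    static final int COM_QUERY = 0x03; // MySQL command byte for a text query

    /** Extracts the SQL text from one COM_QUERY packet; returns null for other commands. */
    static String parseComQuery(byte[] packet) {
        if (packet.length < 5) {
            throw new IllegalArgumentException("packet too short");
        }
        // Bytes 0-2: payload length, little-endian. Byte 3: sequence id (unused here).
        int length = (packet[0] & 0xFF) | ((packet[1] & 0xFF) << 8) | ((packet[2] & 0xFF) << 16);
        if (length < 1 || packet.length < 4 + length) {
            throw new IllegalArgumentException("truncated packet");
        }
        if ((packet[4] & 0xFF) != COM_QUERY) {
            return null;
        }
        // The payload is the command byte followed by the query text.
        return new String(packet, 5, length - 1, StandardCharsets.UTF_8);
    }

    /** Runs the SQL through JDBC and returns the row count or update count. */
    static int forward(Connection jdbc, String sql) throws SQLException {
        try (Statement st = jdbc.createStatement()) {
            if (st.execute(sql)) {
                int rows = 0;
                try (ResultSet rs = st.getResultSet()) {
                    while (rs.next()) {
                        rows++; // a real proxy would encode each row back into wire format
                    }
                }
                return rows;
            }
            return st.getUpdateCount();
        }
    }

    /** Builds a COM_QUERY packet, mainly so the parser can be exercised without a client. */
    static byte[] buildComQuery(String sql) {
        byte[] text = sql.getBytes(StandardCharsets.UTF_8);
        int length = text.length + 1; // command byte plus query text
        byte[] packet = new byte[4 + length];
        packet[0] = (byte) length;
        packet[1] = (byte) (length >> 8);
        packet[2] = (byte) (length >> 16);
        packet[3] = 0; // sequence id
        packet[4] = (byte) COM_QUERY;
        System.arraycopy(text, 0, packet, 5, text.length);
        return packet;
    }
}
```

Calling parseComQuery(buildComQuery("SELECT 1")) gives back the SQL string, which forward can then hand to whatever JDBC connection the proxy holds. Handshake, authentication, and result-set encoding are left out entirely.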

Myosotis is still rather simple but it already has a couple of very interesting features. First, it works for both MySQL and PostgreSQL. That's a good start. Wire protocols are very time-consuming to implement. Another feature is that Myosotis is really fast. This deserves explanation and some proof.

As other people have discovered, proxying is very CPU-intensive. It also involves a lot of concurrency, since a proxy may have to manage hundreds or even thousands of connections. Java is already fast in single threads--after a few runs through method invocations, the JVM has compiled the bytecodes down to native machine code. In addition, Java uses multiple CPUs relatively efficiently. Myosotis uses a thread per connection. Java automatically schedules these threads across all CPUs and optimizes memory access in multi-core environments.
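For readers who have not seen the thread-per-connection model, a rough sketch looks like the following. This is a toy line-based server, not Myosotis code: the handle method is a hypothetical stand-in for the real work of parsing a wire protocol request and calling JDBC, and all names are invented for the example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Illustrative sketch only; names are hypothetical, not taken from Myosotis.
class ThreadPerConnectionSketch {
    /** Stand-in for protocol handling: a real proxy would parse and forward the request. */
    static String handle(String request) {
        return "OK " + request;
    }

    /** Starts a loopback server; pass port 0 to let the OS pick a free port. */
    static ServerSocket start(int port) {
        try {
            ServerSocket server = new ServerSocket(port, 50, InetAddress.getLoopbackAddress());
            Thread acceptor = new Thread(() -> acceptLoop(server), "acceptor");
            acceptor.setDaemon(true);
            acceptor.start();
            return server;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static void acceptLoop(ServerSocket server) {
        while (!server.isClosed()) {
            try {
                Socket client = server.accept();
                // One dedicated thread per client connection.
                Thread worker = new Thread(() -> serve(client));
                worker.setDaemon(true);
                worker.start();
            } catch (IOException e) {
                return; // stop accepting, for example after the server socket is closed
            }
        }
    }

    static void serve(Socket client) {
        try (Socket s = client;
             BufferedReader in = new BufferedReader(
                     new InputStreamReader(s.getInputStream(), StandardCharsets.UTF_8));
             PrintWriter out = new PrintWriter(
                     new OutputStreamWriter(s.getOutputStream(), StandardCharsets.UTF_8), true)) {
            String line;
            while ((line = in.readLine()) != null) {
                out.println(handle(line)); // blocking I/O is fine: this thread serves one client
            }
        } catch (IOException ignored) {
            // client went away; the thread simply ends
        }
    }

    /** Tiny client used to try the server: sends one line and returns the reply. */
    static String roundTrip(int port, String request) {
        try (Socket s = new Socket(InetAddress.getLoopbackAddress(), port);
             BufferedReader in = new BufferedReader(
                     new InputStreamReader(s.getInputStream(), StandardCharsets.UTF_8));
             PrintWriter out = new PrintWriter(
                     new OutputStreamWriter(s.getOutputStream(), StandardCharsets.UTF_8), true)) {
            out.println(request);
            return in.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The appeal of this design is that each worker can use plain blocking reads and writes, and the JVM and operating system take care of spreading the threads over the available cores.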

We can show Myosotis throughput empirically using Bristlecone, an open source test framework we wrote to measure performance of database clusters. We test proxy throughput by issuing do-nothing queries as quickly as possible with varying numbers of threads. The following run compares Myosotis against a uni/cluster 2007.1 process (a much more complex commercial middleware clustering software) and MySQL Proxy 0.6.1 running without Lua scripts. The proxy test environment is a Dell SC 1425 with 4 cores running CentOS5 and MySQL 5.1.23.
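The measurement idea can be sketched in a few lines of Java. This is not Bristlecone code, only an illustration of the approach under simple assumptions: a fixed number of threads each run a task in a tight loop until a deadline, and throughput is the total count divided by elapsed time. The class name ThroughputSketch and the measure method are invented for the example; in a real proxy test the task would issue a trivial query over a connection to the proxy rather than do nothing locally.

```java
import java.util.concurrent.atomic.LongAdder;

// Illustrative sketch only; not part of Bristlecone.
class ThroughputSketch {
    /**
     * Runs task repeatedly on the given number of threads for roughly millis
     * milliseconds and returns completed operations per second.
     */
    static double measure(int threads, long millis, Runnable task) {
        LongAdder completed = new LongAdder();
        Thread[] workers = new Thread[threads];
        long begin = System.nanoTime();
        long deadline = begin + millis * 1_000_000L;
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                // Subtraction keeps the comparison correct even if nanoTime wraps.
                while (System.nanoTime() - deadline < 0) {
                    task.run(); // assumption: a real test would run a do-nothing query here
                    completed.increment();
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        double seconds = (System.nanoTime() - begin) / 1e9;
        return completed.sum() / seconds;
    }
}
```

For example, ThroughputSketch.measure(8, 5000, task) would run eight threads for about five seconds. Repeating the call with different thread counts gives a throughput curve of the kind discussed here.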

The results are striking. Myosotis gets between 3000 and 3500 queries per second when 8 threads are simultaneously running queries. To see processor scaling, run htop while the Myosotis Connector is being tested. You see something like this--a nice distribution across 4 cores.

Myosotis is a very simple proxy now but it has the foundation to create something great. We have big plans for Myosotis--it's a key part of our Tungsten architecture for database scale-out, which we will be rolling out later in the summer. The next step is to add routing logic so that we can implement load balancing and failover. We'll be doing that over the next few months. Meanwhile, if you want to see how fast Java proxies for SQL can be, check us out at http://myosotis.continuent.org.

P.S. If you want to repeat the test shown here on your own proxy, download Bristlecone and try it out. I used the ReadSimpleScenario test, which is specifically designed to check middleware latency.


Anonymous said...

Where is the download for Myosotis? The linked site has changed to something else (Tungsten). Please post an updated link. Thanks!

Robert Hodges said...

@Anonymous, code for the myosotis connector is available at SourceForge.net and you are welcome to download and build it. About 18 months ago we changed our licensing so that further myosotis development is closed source. We withdrew the binary builds because we felt it was confusing to have old copies of code with bugs in them.

As compensation, we fully open sourced our MySQL replication technology in the Tungsten Replicator. It's available at http://code.google.com/p/tungsten-replicator/.

Anonymous said...

Just curious to know whether the new open source Replicator has the nice read-write splitting feature that was in SQL Router.

Also, any comments on whether the replicator can do sharding? It says this on http://code.google.com/p/tungsten-replicator/: "Tungsten Replicator is a high performance, open source, data replication engine for MySQL. It offers a set of features that surpass any open source replicator available today: global transaction IDs to support failover, flexible transaction filtering, extensible transaction metadata, sharding, multiple replication services per process, high performance, and simple, well-documented operation."