Sep 11, 2008

MySQL 5.0 to 4.1 "Down-Version" Replication using Tungsten

A couple of months ago Mark Callaghan mentioned it would be very nice to have a replication product that could transfer data from newer to older versions of MySQL. Ever since then I have been interested in trying it with our new Tungsten Replicator. Today I finally got the chance.

I have a couple of Centos5 virtual machines running on my Mac that I have been using to test the latest Tungsten Replicator 0.9.1 build. I happen to have MySQL 5.0.22 (the antiquated version that comes with CentOS5) on one VM. I set up MySQL 4.1.22 on the other CentOS5 VM and tried to make it a slave of the 5.0 server using MySQL replication. The result was the following error message:

080911 15:25:13 [ERROR] Master reported an unrecognized MySQL version. Note that 4.1 slaves can't replicate a 5.0 or newer master.

This message was highly satisfactory. MySQL replication is not supposed to work down-version from 5.0 to 4.1.

Now to try it with the Tungsten Replicator. I followed the Tungsten Replicator manual instructions with the MySQL 5.0 host as master and the MySQL 4.1 host as the slave. It turns out the set-up is identical for both versions, which made this part very fast. I then issued the standard commands to bring up the master:
trepsvc start configure goOnline goMaster
followed by commands to start the slave:
trepsvc start configure goOnline
Now it was time to fire up mysql against the master database and enter some data.
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 25 to server version: 5.0.22-log

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql> create table foobar13 (id int, data varchar(25));
Query OK, 0 rows affected (0.12 sec)

mysql> insert into foobar13 values(1, 'first!!!');
Query OK, 1 row affected (0.00 sec)
However, over on the slave, nothing showed up. OK, I know we have never tested against MySQL 4.1, but what's up? Well, in the slave replicator log the following message appeared:

INFO | jvm 1 | 2008/09/11 23:05:06 | 2008-09-11 23:05:06,536 FATAL tungsten.replicator.NodeManager You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'SCHEMA IF NOT EXISTS tungsten' at line 1

Oops! The replicator tried to issue a CREATE SCHEMA command to create its catalog database. CREATE SCHEMA was only introduced in MySQL 5.0.2. Change this to CREATE DATABASE and run Ant to build and redeploy the code. Restart the slave and check the logs. They look clean this time. Now login to the slave database with mysql and look for the foobar13 table:

Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 20 to server version: 4.1.22-standard-log

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql> select * from foobar13;
| id | data |
| 1 | first!!! |

Cool, it worked. Replication from MySQL 5.0 to MySQL 4.1 successfully demonstrated.

We will have a much improved Tungsten Replicator 1.0 alpha build ready in a couple of days that includes this fix and many others. By the way, we are working on getting heterogeneous replication to work as well. I'll have a lot more to say about that in future posts.


Linas Virbalas said...

Hello Robert,

I just read your new blog comment. I would like to comment and ask some questions about it:
1. I understand that Tungsten Replicator is replicating SQL statements currently. That’s why this down-version replication would only work until we do not use any lower-MySQL-version-not-supported SQL statements. Eg. If a client would be issuing the CREATE SCHEMA on a master, it would fail on the slave. Is this correct?
2. Some time ago I had experience with a big migration from MySQL 4 to MySQL 5. We had a lot of SQL statement changes to do in a web app to get it working on MySQL 5. The company's data was critical (financial) so at that moment we would have really used down-version replication if it would have been available. Do I understand correctly, that for a case like that, Tungsten Replicator _will_ be useful, but only when it will have row replication instead of currently available SQL statement replication?

Thank you,
Linas Virbalas

Robert Hodges said...

Hi Linas!

On #1, that's correct that you have to ensure you use appropriate commands to replicate. DDL commands are especially delicate. We should add a filter mechanism ASAP that allows users to drop or morph replication events easily.

On #2, no, I think even statement events will still be useful. Basic insert/update/deletes are pretty fungible at the syntax level so they work across versions. The main issue is what happens when you have to change them to make changes backwards compatible for old schema. In that case it's easier if the replication events are Tuple events containing raw data. We plan to add these shortly.

(Full disclosure: Linas works at Continuent and sent these questions via email. I asked him to post them on the blog because they are of general interest.)