I have been procrastinating on implementation of encryption in the Tungsten Replicator because it felt as if we were going to have to do some surgery for it to work correctly. (Another hypothesis is that I'm lazy.) However, this morning I was talking via Skype to Mark Stenbäck, an engineer from Finland whom we met through a recent conference. Mark had a great idea. Why not just write an event filter? An excellent thought...Here's how it could work.
Filters transform or alter SQL events in the Tungsten Replicator. Let's say that you are replicating using MySQL 5.0. All SQL events contain SQL statements, which look like the following in our transaction history storage. The field 'SQL(0)' is the statement. Everything else is metadata.
The SQL statement is contained in a simple data structure that is easy to manipulate within filter code. Now suppose you write the following Java classes:
SEQ# = 97
- TIME = 2009-01-12 10:01:46.0
- EVENTID = 000282:20633
- SOURCEID = centos5d
- STATUS = 2
- SCHEMA = mats
- SQL(0) = DROP PROCEDURE IF EXISTS updateRow
- StatementEncryptionFilter -- Looks at the event and encrypts the statement, using secret key (2-way) encryption.
- StatementDecryptionFilter -- Looks at the event and decrypts the statement using 2-way decryption.
There are some subtleties here--when there are multiple filters you need to make sure that the encryption filter is the last one called and similarly that the decryption filter is called first. Otherwise the SQL statements will look like garbage when other filters try to read them.
# Pre-storage filter(s). Runs on events before they are stored
# in the transaction history log.
# Post-storage filter(s). Runs on events after pulling them from
# storage and before handing off for application.
# Encrypt/decrypt filter definitions.
Another subtlety is row replication, which uses lists of objects to contain replicated values. These are harder to encrypt. In this case it might make sense to alter the Tungsten Replicator architecture to add a configurable storage filter that serializes and deserializes events in the transaction history log. This would be a nice feature addition. It's now JIRA TREP-199--we should be able to implement it within a couple of builds.
To get back to the topic of open source, Mark's suggestion is an excellent example of the power of working with a diverse group of people with similar interests. His idea seems straightforward to implement and also leads to a simple but powerful improvement to the replicator design.
Do you have any ultra-cool replication ideas? Post them on the Tungsten Replicator mailing list or just implement them directly. I'm going to set up a project for extensions to the replicator. Contributions are welcome--they will be open source and free to everyone. We'll gladly extend the Replicator to help make your ideas work. Meanwhile I'll keep posting other neat ideas for extensions. Please come on down and join the fun.