Jan 1, 2013

Questions about MariaDB JDBC Driver

The recent release of the MariaDB client libraries has prompted questions about their purpose as well as provenance.  Colin Charles posted that some of these would be answered in the very near future.  I have a couple of specific questions about the MariaDB JDBC driver, which I hope will be addressed at that time.  

1.) What is really in the MariaDB JDBC driver and how exactly does it differ from the drizzle JDBC driver?  What, if any, relation is there to Connector/J code?  There is a JIRA project but it contains only four bugs, hence is not very informative.  The launchpad bzr history shows detailed check-ins but not overall intent. 

2.) Why relicense from BSD to LGPL?  I have checked the class headers and so far as attributions are concerned everything seems to be done quite properly.  However, the license change appears to prevent those of us currently using the drizzle JDBC driver from transferring code changes back to the drizzle driver.  If so, that seems a little unneighborly.  

Here is some background on the relationship between the drivers.  The MariaDB JDBC client is a fork of the BSD-licensed drizzle JDBC driver originally developed by Marcus Eriksson, who continues to maintain the code.  According to the bzr change history the code forked after rev 253, which was 24 April 2011.  There are still many similarities in the Java classes.  For instance, a number of classes in the org.mariadb.jdbc.internal.common package differ by little other than licensing headers and package names.  The MariaDB code is now up to rev 375 and includes substantial changes that appear to be designed to bring the MariaDB JDBC driver closer to the capabilities of the MySQL Connector/J driver.  

At Continuent we have a lively interest in the drizzle JDBC driver, as we adopted it for Tungsten Replicator some time ago.  The code had fewer bugs than Connector/J, which was attractive.  More importantly, Marcus kindly accepted a patch from my colleague Stephane Giron (working as a Continuent employee) that made it easy for us to send queries using binary data rather than the usual Unicode data required by the JDBC standard.  This fix allows Tungsten to replicate codesets and binary data correctly.  We have since contributed a few other patches.  Our modest contribution in part reflects the quality of the base code. 

While waiting for answers I would like to commend Marcus as well as other drizzle contributors for their work.  We are particularly indebted to Marcus for starting and continuing the drizzle JDBC project.  Tungsten Replicator users have applied many trillions of transactions using the drizzle driver.  If the MariaDB JDBC driver gains wide acceptance, the rest of the MySQL community owes Marcus Eriksson substantial thanks as well.  

Sep 3, 2012

Life in the Amazon Jungle

In late 2011 I attended a lecture by John Wilkes on Google compute clusters, which link thousands of commodity computers into huge task processing systems.  At this scale hardware faults are common.  Google puts a lot of effort into making failures harmless by managing hardware efficiently and using fault-tolerant application programming models.  This is not just good for application up-time.  It also allows Google to operate on cheaper hardware with higher failure rates, hence offers a competitive advantage in data center operation.

It's becoming apparent we all have to think like Google to run applications successfully in the cloud.  At Continuent we run our IT and an increasing amount of QA and development on Amazon Web Services (AWS).   During the months of July and August 2012 at least 3 of our EC2 instances were decommissioned or otherwise degraded due to hardware problems.  One of the instances hosted our main website www.continuent.com.

In Amazon failures are common and may occur with no warning.  You have minimal ability to avoid them or in some cases even to understand the root causes.  To survive in this environment, applications need to obey a new law of the jungle.  Here are the rules as I understand them. 

First, build clusters of redundant services.  The www.continuent.com failure brought our site down for a couple of hours until we could switch to a backup instance.  Redundant means up and ready to handle traffic now, not after a bridge call to decide what to do.  We protect our MySQL servers by replicating data cross-region using Tungsten, but the website is an Apache server that runs on a separate EC2 instance.  Lesson learned.  Make everything a cluster and load balance traffic onto individual services so applications do not have to do anything special to connect.  

Second, make applications fault-tolerant.  Remote services can fail outright, respond very slowly, or hang.  To live through these problems apply time-honored methods to create loosely coupled systems that degrade gracefully during service outages and repair themselves automatically when service returns.  Here are some of my favorites.  
  1. If your application has a feature that depends on a remote service and that service fails, turn off the feature but don't stop the whole application.  
  2. Partition features so that your applications operate where possible on data copies.  Learn how to build caches that do not have distributed consistency problems.  
  3. Substitute message queues for synchronous network calls.  
  4. Set timeouts on network calls to prevent applications from hanging indefinitely.  In Java you usually do this by putting the calls in a separate thread.  
  5. Use thread pools to limit calls to remote services so that your application does not explode when those services are unavailable or fail to respond quickly. 
  6. Add auto-retry so that applications reconnect to services when they are available again. 
  7. Add catch-all exception handling to deal with unexpected errors from failed services.  In Java this means catching RuntimeException or even Throwable to ensure it gets properly handled. 
  8. Build in monitoring to report problems quickly and help you understand failures you have not previously seen.  
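Rules 4, 5, and 7 can be sketched in a few lines of Java. This is a minimal illustration, not production code; the remote call is a stand-in for any slow or failing service:

```java
import java.util.concurrent.*;

// Sketch of rules 4, 5, and 7: a bounded thread pool, a timeout on a
// remote call, and catch-all exception handling so a sick service
// degrades a feature instead of killing the application.
public class GuardedCall {
    // Small fixed pool so a dead service cannot tie up unbounded threads.
    // Daemon threads so the pool does not block JVM shutdown.
    private static final ExecutorService pool =
            Executors.newFixedThreadPool(4, r -> {
                Thread t = new Thread(r);
                t.setDaemon(true);
                return t;
            });

    public static String callWithTimeout(Callable<String> remoteCall,
                                         long timeoutMillis) {
        Future<String> future = pool.submit(remoteCall);
        try {
            // Give up after the timeout instead of hanging indefinitely.
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true);      // interrupt the stuck call
            return "unavailable";     // degrade gracefully, don't crash
        } catch (Throwable t) {
            // Catch-all so unexpected failures from the service are contained.
            return "unavailable";
        }
    }

    public static void main(String[] args) {
        System.out.println(callWithTimeout(() -> "ok", 1000));
        System.out.println(callWithTimeout(() -> {
            Thread.sleep(5000);       // simulates a hung remote service
            return "too late";
        }, 100));
    }
}
```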
Third, revel in failure.  Netflix takes this to an extreme with Chaos Monkey, which introduces failures in running systems.  Another approach is to build scaffolding into applications so operations fail randomly.  We use that technique (among others) to test clusters.  In deployed database clusters I like to check regularly that any node can become the master and that you can recover any failed node.  However, this is just the beginning.  There are many, many ways that things can fail.  It is better to provoke the problems yourself than have them occur for the first time when something bad happens in Amazon.  

There is nothing new about these suggestions.  That said, the Netflix approach exposes the difference between cloud operation and traditional enterprise computing.  If you play the game applications will stay up 24x7 in this rougher landscape and you can tap into the flexible cost structure and rapid scaling of Amazon.  The shift feels similar to using database transactions or eliminating GOTO statements--just something we all need to do in order to build better systems.  There are big benefits to running in the cloud but you really need to step up your game to get them.  

Jun 20, 2009

When SANs Go Bad

They sometimes go bad in completely unpredictable ways. Here's a problem I have now seen twice in production situations. A host boots up nicely and mounts file systems from the SAN. At some point a SAN component (e.g., a Fibre Channel switch or controller) fails in such a way that the SAN goes away but the file system still appears visible to applications.

This kind of problem is an example of a Byzantine fault where a system does not fail cleanly but instead starts to behave in a completely arbitrary manner. It seems that you can get into a state where the in-memory representation of the file system inodes is intact but the underlying storage is non-responsive. The non-responsive file system in turn can make operating system processes go a little crazy. They continue to operate but show bizarre failures or hang. The result is problems that may not be diagnosed or even detected for hours.

What to do about this type of failure? Here are some ideas.
  1. Be careful what you put on the SAN. Log files and other local data should not go onto the SAN. Use local files with syslog instead. Think about it: your application is sick and trying to write a log message to tell you about it on a non-responsive file system. In fact, if you have a robust scale-out architecture, don't use a SAN at all. Use database replication and/or DRBD instead to protect your data.
  2. Test the SAN configuration carefully, especially failover scenarios. What happens when the host fails over from one access path to another? What happens when another host picks up the LUN from a "failed" host? Do you have fencing properly enabled?
  3. Actively look for SAN failures. Write test files to each mounted file system and read them back as part of your regular monitoring. That way you know that the file system is fully "live."
The last idea gets at a core issue with SAN failures--they are rare, so it's not the first thing people think of when there is a problem. The first time this happened on one of my systems it was around 4 a.m. It took a really long time to figure out what was going on. We didn't exactly feel like geniuses when we finally checked the file system.
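A minimal version of idea 3 looks like the following. The check runs in its own thread with a timeout so that the monitor itself cannot hang on a dead mount; the canary file name is an assumption:

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.*;
import java.util.concurrent.*;

// Sketch of idea 3: write a canary file and read it back, bounded by a
// timeout so the check itself cannot hang on a non-responsive mount.
public class CanaryCheck {
    public static boolean fileSystemIsLive(Path mountPoint, long timeoutMillis) {
        ExecutorService ex = Executors.newSingleThreadExecutor();
        try {
            Future<Boolean> check = ex.submit(() -> {
                Path canary = mountPoint.resolve(".canary");
                String stamp = Long.toString(System.nanoTime());
                Files.write(canary, stamp.getBytes(StandardCharsets.UTF_8));
                String readBack = new String(Files.readAllBytes(canary),
                        StandardCharsets.UTF_8);
                return stamp.equals(readBack);   // round trip succeeded
            });
            return check.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (Exception e) {
            return false;   // timeout or I/O error: treat the mount as dead
        } finally {
            ex.shutdownNow();
        }
    }
}
```

Wire a check like this into regular monitoring for every mounted file system so a half-dead SAN shows up as an alert, not a 4 a.m. mystery.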

SANs are great technology, but there is an increasingly large "literature" of SAN failures on the net, such as this overview from Arjen Lentz and this example of a typical failure. You need to design mission-critical systems with SAN failures in mind. Otherwise you may want to consider avoiding SAN use entirely.

Mar 29, 2009

Implementing Relaxed Consistency Database Clusters with Tungsten SQL Router

In December 2007 Werner Vogels posted a blog article entitled Eventual Consistency, since updated with a new article entitled Eventually Consistent - Revisited. In a nutshell it described how to scale databases horizontally across nodes by systematically trading off availability, strict data consistency, and partition resilience as defined by the CAP theorem. According to CAP, you can only have two of three of these properties at any one time. The route to highly available and performant databases, according to Vogels, is eventual consistency in which distributed database contents at some point converge to a single value but at any given time may be inconsistent across replicas. This is the idea behind databases like SimpleDB.

I read the original blog article at about 2am on a Sunday morning. It was like a thunderclap. Like transactions and state machines, CAP was one of those ideas that provide instant clarity to a large class of problems, in this case related to database availability and performance. But it also raised an immediate question: can't we apply CAP systematically on conventional SQL databases? That way you don't have to throw the relational database baby out with the strict consistency bathwater.

This is not an implausible idea. Most database engines have built-in master/slave replication to at least some degree, so there's no problem distributing data. (Shameless plug: If you don't like what your database provides, try ours.) The real problem is that you need to change how applications access the database. They need to implement CAP trade-offs in a consistent and easily understandable way. That's where the Tungsten SQL Router comes in.

Tungsten SQL Router is a thin Java JDBC driver wrapper that enhances conventional JDBC drivers to implement database session semantics based on CAP. SQL Router adds a "quality of service" or qos to each database session. Being programmers we had to invent our own terms, so here are the initial qos values.
  • RW_STRICT -- This session is used for writes; all data are strictly consistent, i.e., appear to all applications on RW_STRICT sessions as soon as they are written. In CAP terms you are picking data consistency + partition tolerance. (Vogel's article uses the term "causal consistency.")
  • RO_RELAXED -- This session is used for reads; data consistency is relaxed, i.e., represents data at an earlier point in time. In CAP terms you are picking availability + partition tolerance. (Vogel's article uses the term "monotonic reads.")
Clients can request the preferred quality of service whenever they create a new transaction. The SQL Router then takes care of connecting to a database that meets the semantics. Here's a typical Tungsten SQL Router URL (i.e., connection string) that routes connections to a MySQL master database:
jdbc:t-router://myservice/mydb?qos=RW_STRICT&createDatabaseIfNotExist=true
The SQL Router only steps in to select connections and to break them as necessary when databases go offline. It has almost no performance impact on Java applications, because we don't touch result sets and there are no proxy hops. That's an important requirement to achieve maximal application transparency.
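For completeness, here is roughly what choosing a quality of service looks like from application code. The helper below just assembles the URL format shown above; the commented-out connection call is ordinary JDBC once the SQL Router driver is on the classpath:

```java
// Sketch of selecting a quality of service per session by building the
// t-router URL shown above. The service and database names are examples.
public class QosExample {
    public static String routerUrl(String service, String db, String qos) {
        return "jdbc:t-router://" + service + "/" + db + "?qos=" + qos;
    }

    public static void main(String[] args) {
        // Strictly consistent session for writes.
        String writes = routerUrl("myservice", "mydb", "RW_STRICT");
        // Relaxed session for reads, which may lag the master.
        String reads = routerUrl("myservice", "mydb", "RO_RELAXED");

        // With the SQL Router driver loaded this is plain JDBC:
        // Connection conn = DriverManager.getConnection(reads, user, password);

        System.out.println(writes);
        System.out.println(reads);
    }
}
```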

Making CAP work properly for conventional applications is not entirely straightforward, which is one of the main reasons why you don't want the logic to be a part of your application. Here are some of the key features that Tungsten SQL Router provides.
  • Distributed database services. SQL Router groups databases into "services." Each database in the service is defined using a simple resource file that defines its name, location, and role (e.g., master or slave).
  • Remote management interfaces. Databases fail or go offline for maintenance and cluster resources change over time. Strict consistency connections in fact explicitly choose to fail when the database is no longer available rather than access old data, so you must handle failover easily. Tungsten SQL Router has a built-in JMX administrative interface that allows you to promote a slave database to become a master, take databases offline, bring them back online, etc., without disturbing or even necessarily notifying applications.
  • Support for non-Java applications. The world is a diverse place and not every application is written in Java. You can embed the SQL Router in the Tungsten Connector, a proxy that allows native MySQL and PostgreSQL applications (Perl, PHP, Ruby, name your favorite...) to connect without library changes or even changing connection strings.
  • Integration with connection pools. SQL Router provides call-backs that can be used to let application connection pools know when to give up connections. A little cooperation here makes things like failover work much more smoothly.
There are other features but it's probably simplest if you visit the Tungsten Project on SourceForge.net, read the wiki documentation, download a copy, and try it out for yourself. There's general information about Tungsten on our community website. Note: our community site is due for an update shortly to add more information about SQL Router and other new projects we are releasing. For the next few days please check out SourceForge.net.

Finally, here's an interesting thought that shows the power of applying CAP semantics in SQL applications. So far we have been talking about database replicas. However, SQL Router relaxed consistency sessions could just as easily read query data from a distributed cache like memcached. An application that specifies qos=RO_RELAXED on a connection is saying it will accept possibly out-of-date data in return for availability. Semantically there is no difference between a cache and a database replica--you can substitute any conforming storage implementation. Exploiting that idea pretty much defines our long-term roadmap for the SQL Router.

In summary SQL Router provides a simple model so that applications can choose whether they want availability or full data consistency while ensuring basic properties like partition resilience. These semantics are key to extending the scale-out database design model to increasingly large clusters, and equally important, to make that model easy to use for clusters of all sizes. Tungsten SQL Router is a work in progress, but the idea of using CAP semantics really seems to have legs. I hope you will try it out and let us know what you think.

p.s., I would like to thank David Van Couvering for pointing out Werner Vogel's article in his blog as well as my colleague Ed Archibald for getting the SQL Router off the ground. Nice working with you guys. :)

Mar 17, 2009

Announcing Tungsten Monitor

Yesterday I posted about our release of Tungsten FSM, a package for building state machines in Java. Today we are publishing Tungsten Monitor, the second of four new Tungsten packages we are releasing on SourceForge.net during the month of March. Tungsten Monitor offers fast, pluggable resource monitoring for database clusters. We have a couple of specific monitor types already implemented; you can add new ones with minimal Java code.

Tungsten Monitor is focused on a single problem: providing continuous notifications of the state of resources in a cluster. Each monitor executes a simple process loop to check the resource and broadcast the resulting status. The status information answers the following questions:
  • What type of resource is this?
  • What is its name?
  • Is the resource up or down?
  • If a replica database, what is the latency compared to a master?
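The process loop described above can be sketched as follows. All type names here are hypothetical stand-ins, not the actual Tungsten Monitor API:

```java
// Sketch of a monitor's process loop: check a resource on a fixed
// interval and broadcast a status notification answering the four
// questions above. These types are illustrative stand-ins only.
public class MonitorSketch {
    // The notification payload.
    public static class Notification {
        public final String resourceType;   // what type of resource?
        public final String resourceName;   // what is its name?
        public final boolean up;            // is it up or down?
        public final double latencySeconds; // replica latency, -1 if N/A

        public Notification(String type, String name, boolean up,
                            double latencySeconds) {
            this.resourceType = type;
            this.resourceName = name;
            this.up = up;
            this.latencySeconds = latencySeconds;
        }
    }

    // Pluggable resource check, e.g. a ping against a replicator.
    public interface ResourceCheck {
        Notification check();
    }

    // Pluggable notifier, e.g. group communications broadcast.
    public interface Broadcaster {
        void send(Notification n);
    }

    // Run the check/broadcast loop a fixed number of times.
    public static void runLoop(ResourceCheck check, Broadcaster out,
                               int iterations, long intervalMillis)
            throws InterruptedException {
        for (int i = 0; i < iterations; i++) {
            out.send(check.check());
            Thread.sleep(intervalMillis);
        }
    }
}
```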
Tungsten Monitor is exceedingly simple--the current version has a total of 13 Java classes. (Who says Java needs to be complex?) Despite this simplicity the monitor has at least three very interesting features.

First, there's no centralized agent framework. You just start a monitor on a replicator, database, etc., and it starts to generate notifications. The off-the-shelf configuration monitors Tungsten Replicator--to get started you unpack code, copy one file, and start. That's it. The monitor figures out everything else automatically.

Second, Tungsten Monitor broadcasts notifications using group communications, which is a protocol for reliable, broadcast messaging between a set of processes known as a "group." Manager processes can tell that a new resource is available because its monitor joins the group and starts to broadcast notifications that the resource is up. If the new resource is a database, this is enough for a smart proxying service to start sending traffic to it automatically without any further management intervention.

Third, you can add new resource checks and notification methods yourself. With just 13 Java classes to start, you obviously won't be getting a lot off the shelf. :) We will add MySQL, PostgreSQL, and Oracle monitors shortly but any competent Java programmer could do the same in an hour or two without waiting for us.

In the meantime, you can download Tungsten Monitor source and binary builds from SourceForge.net. Also, here's a wiki article that describes basic installation and configuration. Take it for a spin, especially if you are using Tungsten Replicator already.

Interested? I hope so. Tungsten Monitor integrates with another of our March projects to route SQL connections automatically to available databases in a master/slave or master/master cluster. Stay tuned for more on that next week. Things are just starting to get interesting around here.

p.s., I did a bit of coding on the monitor but it really belongs to Gilles Rayrat, a colleague of mine in Grenoble. Nice job, Gilles!

Mar 15, 2009

Announcing Tungsten Finite State Machine Library

It is my pleasure to announce publication of the Tungsten Finite State Machine Library (Tungsten FSM) as an open source package hosted on SourceForge.net. This is the first of four new components for database clustering and replication that we will be releasing into open source during the month of March.

Tungsten FSM is a Java library for implementing in-memory state machines. It is lightweight and very simple to use. Each Tungsten FSM state machine tracks the state of a particular instance (i.e., object) in Java. The model provides standard features of state machines including states, transitions, actions, and error handling. Source and binary downloads are available here--there is also wiki documentation that explains how to use Tungsten FSM with code examples.

Here's a little background on the Tungsten FSM library and how it arose. State machines let you model the behavior of complex systems including system states and input/outputs in a simple and understandable way. They are as important for distributed systems as transactions are for databases. Among other things, state machines enable you to ensure that network services behave deterministically when presented with multiple, concurrent inputs. That determinism in turn is a fundamental requirement for organizing groups of processes into database clusters, which is what we do at Continuent.
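To make the idea concrete, here is a minimal enum-based state machine of the sort described above: explicit states, a transition table, and deterministic rejection of illegal inputs. It illustrates the concept only; it is not the Tungsten FSM API.

```java
import java.util.*;

// A minimal in-memory state machine: states, a transition table, and
// deterministic handling of inputs. Illegal inputs fail explicitly
// rather than leaving the machine in an undefined state.
public class MiniFsm {
    public enum State { OFFLINE, ONLINE, ERROR }

    private State state = State.OFFLINE;

    private static final Map<State, Map<String, State>> TRANSITIONS =
            new HashMap<>();
    static {
        TRANSITIONS.put(State.OFFLINE, Map.of("start", State.ONLINE));
        TRANSITIONS.put(State.ONLINE, Map.of("stop", State.OFFLINE,
                                             "fail", State.ERROR));
        TRANSITIONS.put(State.ERROR, Map.of("reset", State.OFFLINE));
    }

    // Apply an input; synchronized so concurrent inputs are serialized.
    public synchronized State apply(String input) {
        State next = TRANSITIONS.get(state).get(input);
        if (next == null)
            throw new IllegalStateException(
                    "No transition for '" + input + "' in state " + state);
        state = next;
        return state;
    }

    public synchronized State getState() { return state; }
}
```

Adding a new administrative operation is then just another row in the transition table, which is exactly the kind of extensibility the post describes.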

When we embarked on development of Tungsten Replicator, it was immediately apparent we would need to implement state machines. Most of the available libraries, such as JBoss jBPM, were heavyweight or otherwise difficult to embed. We therefore wrote a lightweight library of our own, adding features as we ran into practical implementation issues like dealing with errors. Tungsten FSM helps us organize code in services--for instance, it has been very easy to add new administrative operations to the Tungsten Replicator simply by adding more state transitions.

However, you don't have to take my word for it. Try out Tungsten FSM and let me know how you like it. For more information on Tungsten in general, please visit our community pages.

p.s., I'm posting this article to aggregators for MySQL and PostgreSQL even though it's not directly related to databases per se. State machines turn out to be essential to database clustering and management, as you'll see in some of the succeeding articles on this blog.

Jan 21, 2009

Encrypting Replicated Data

Open source projects have a great way of pulling in original ideas. Take encrypting replicated SQL: it's a big deal when you are not sure about network security. This problem comes up very quickly when you transmit data over WAN links or in shared/cloud environments.

I have been procrastinating on implementation of encryption in the Tungsten Replicator because it felt as if we were going to have to do some surgery for it to work correctly. (Another hypothesis is that I'm lazy.) However, this morning I was talking via Skype to Mark Stenbäck, an engineer from Finland whom we met through a recent conference. Mark had a great idea. Why not just write an event filter? An excellent thought...Here's how it could work.

Filters transform or alter SQL events in the Tungsten Replicator. Let's say that you are replicating using MySQL 5.0. All SQL events contain SQL statements, which look like the following in our transaction history storage. The field 'SQL(0)' is the statement. Everything else is metadata.
SEQ# = 97
- TIME = 2009-01-12 10:01:46.0
- EVENTID = 000282:20633
- SOURCEID = centos5d
- STATUS = 2
- SCHEMA = mats
- SQL(0) = DROP PROCEDURE IF EXISTS updateRow
The SQL statement is contained in a simple data structure that is easy to manipulate within filter code. Now suppose you write the following Java classes:
  • StatementEncryptionFilter -- Looks at the event and encrypts the statement, using secret key (2-way) encryption.
  • StatementDecryptionFilter -- Looks at the event and decrypts the statement using 2-way decryption.
Implementing encryption itself is not very difficult in Java. You can follow the cryptographic examples in the Java documentation to get something working quickly. You then add the filters into the Tungsten replicator.properties file. The configuration might look like the following:
# Pre-storage filter(s).  Runs on events before they are stored
# in the transaction history log.
replicator.prefilter=logger,encrypt

# Post-storage filter(s). Runs on events after pulling them from
# storage and before handing off for application.
replicator.postfilter=decrypt,logger
...

# Encrypt/decrypt filter definitions.
replicator.filter.encrypt=\
org.foo.tungsten.StatementEncryptionFilter
replicator.filter.encrypt.password=!encrypt?!
replicator.filter.decrypt=\
org.foo.tungsten.StatementDecryptionFilter
replicator.filter.decrypt.password=!encrypt?!
There are some subtleties here--when there are multiple filters you need to make sure that the encryption filter is the last one called and similarly that the decryption filter is called first. Otherwise the SQL statements will look like garbage when other filters try to read them.
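For illustration, the 2-way encryption itself might look like the following, using the standard javax.crypto API. The class name is hypothetical, and a real filter would implement the Tungsten Filter interface and derive its key from the configured password:

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import javax.crypto.Cipher;
import javax.crypto.spec.SecretKeySpec;

// Rough sketch of the secret key (2-way) encryption the filters would
// apply to each SQL statement. Base64 keeps the ciphertext printable
// inside the event. This is an illustration, not the actual filter code.
public class StatementCrypto {
    private final SecretKeySpec key;

    public StatementCrypto(byte[] sixteenByteKey) {
        this.key = new SecretKeySpec(sixteenByteKey, "AES");
    }

    public String encrypt(String sql) throws Exception {
        Cipher c = Cipher.getInstance("AES");
        c.init(Cipher.ENCRYPT_MODE, key);
        byte[] raw = c.doFinal(sql.getBytes(StandardCharsets.UTF_8));
        return Base64.getEncoder().encodeToString(raw);
    }

    public String decrypt(String encoded) throws Exception {
        Cipher c = Cipher.getInstance("AES");
        c.init(Cipher.DECRYPT_MODE, key);
        byte[] raw = c.doFinal(Base64.getDecoder().decode(encoded));
        return new String(raw, StandardCharsets.UTF_8);
    }
}
```

The encryption filter would call encrypt() on each statement before storage and the decryption filter would call decrypt() after retrieval, which is why their ordering relative to other filters matters.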

Another subtlety is row replication, which uses lists of objects to contain replicated values. These are harder to encrypt. In this case it might make sense to alter the Tungsten Replicator architecture to add a configurable storage filter that serializes and deserializes events in the transaction history log. This would be a nice feature addition. It's now JIRA TREP-199--we should be able to implement it within a couple of builds.

To get back to the topic of open source, Mark's suggestion is an excellent example of the power of working with a diverse group of people with similar interests. His idea seems straightforward to implement and also leads to a simple but powerful improvement to the replicator design.

Do you have any ultra-cool replication ideas? Post them on the Tungsten Replicator mailing list or just implement them directly. I'm going to set up a project for extensions to the replicator. Contributions are welcome--they will be open source and free to everyone. We'll gladly extend the Replicator to help make your ideas work. Meanwhile I'll keep posting other neat ideas for extensions. Please come on down and join the fun.

Jan 5, 2009

Using Tungsten Replicator Filters to Implement MySQL Time-Delayed Replication

Time-delayed replication is a useful feature that allows a slave to lag a fixed amount of time behind a master, thus providing a time window to recover from disasters like deleting a 10 million line table by accident. You just run over to the slave, turn off replication, and recover lost data, as the delayed updates mean it has yet to see your deadly mistake. It's a simple way to protect your administrative honor as well as your job.

Time-delayed replication has been on the MySQL to-do list since at least 2001. It's currently scheduled for release 6.0 and the fix is included in recent OurDelta builds. However, there's a very simple way to get the feature with Tungsten Replicator filters. This works for unadulterated MySQL 5.0 and 5.1 releases.

I wrote about filters in a previous post on the pluggable Tungsten Replicator architecture. Filters are hooks into the replicator that allow your code to examine, change, and drop events before applying them to another database. Here's the Java interface:
public interface Filter extends ReplicatorPlugin
{
    public ReplDBMSEvent filter(ReplDBMSEvent event)
            throws ReplicatorException, InterruptedException;
}
Is there a simple way to use a filter for time-delayed replication? Actually, there is. It turns out that the ReplDBMSEvent passed into the filter includes a "source timestamp" that tells when the event was pulled from the log. Tungsten reads binlog events almost instantaneously, so we can use the source timestamp to approximate the time at which the update was written on the master. If we also assume that clocks are synchronized between master and slave hosts, it's a matter of minutes to write a simple filter that implements time-delayed replication.

The filter itself is surprisingly short, because all we have to do is compute how long until the SQL should be applied and then go to sleep for that period of time. There are no complex API calls or management gyrations because our filter runs inside the replicator. This took about 20 minutes to write in Eclipse. (By contrast, this blog article is now 2 hours and counting.) Here is all the code you need.
package com.continuent.tungsten.replicator.filter;

import java.sql.Timestamp;

import org.apache.log4j.Logger;

import com.continuent.tungsten.replicator.ReplicatorException;
import com.continuent.tungsten.replicator.event.ReplDBMSEvent;
import com.continuent.tungsten.replicator.plugin.PluginContext;

/**
 * Filter to delay a transaction until a particular point in time has passed.
 * The time delay filter uses the originating timestamp from the replication
 * event to adjust time delays. We assume that clocks are synchronized to
 * within some reasonable precision between event producers and consumers.
 */
public class TimeDelayFilter implements Filter
{
    private static Logger logger = Logger.getLogger(TimeDelayFilter.class);
    private long timeDelayMillis = 0;

    /**
     * Sets the time delay in seconds.
     */
    public void setDelay(long timeDelaySeconds)
    {
        timeDelayMillis = timeDelaySeconds * 1000;
    }

    public ReplDBMSEvent filter(ReplDBMSEvent event)
            throws ReplicatorException, InterruptedException
    {
        // Compute the interval that we should delay.
        Timestamp sourceTstamp = event.getSourceTstamp();
        long futureTime = sourceTstamp.getTime() + timeDelayMillis;
        long intervalMillis = futureTime - System.currentTimeMillis();

        // Sleep until it is time to deliver this event. We let
        // InterruptedException flow through or the replicator will not
        // be able to shut down.
        if (intervalMillis > 0)
            Thread.sleep(intervalMillis);

        return event;
    }

    public void configure(PluginContext context) throws ReplicatorException
    {
        logger.info("Time delay filtering: event delivery delay set to "
                + (timeDelayMillis / 1000) + " seconds");
    }

    public void prepare(PluginContext context) throws ReplicatorException
    {
    }

    public void release(PluginContext context) throws ReplicatorException
    {
    }
}
User-written code lives in a directory called lib-ext. If you write your own filters you'll need to compile them into a JAR file using javac/jar. You need to put the JAR into the lib-ext directory of any slave replicator that uses the filter. Then you tell the replicator to run your filter by updating file replicator.properties, which contains all the configuration settings for Tungsten Replicator.

First, we tell the replicator that there is an "applier-side" filter that will run on events that are applied to the slave database. The property setting looks like this--it specifies a filter nicknamed "delay" that processes events before we apply them.
# Post-storage filter selection.  Value must be one or more comma-separated
# logical filter names.
replicator.postfilter=delay
We now need to define the "delay" filter, which requires two more entries in replicator.properties, so that our filter will delay events by 5 minutes (300 seconds). Tungsten uses a very simple protocol for configuring filters--it reads the 'delay=300' property and assigns this automatically using the corresponding variable setter on the filter class.
# Time delay filter.  Should only be used on slaves, as it delays storage
# of new events on the master. The time delay is in seconds.
replicator.filter.delay=\
com.continuent.tungsten.replicator.filter.TimeDelayFilter
replicator.filter.delay.delay=300
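The property-to-setter convention can be sketched with a few lines of reflection. This is an illustration of the idea, not the actual Tungsten configuration code; the DemoFilter class is a hypothetical stand-in:

```java
import java.lang.reflect.Method;

// Sketch of the property-to-setter convention: a property such as
// replicator.filter.delay.delay=300 becomes a call to setDelay(300)
// on the filter instance. Illustrative only.
public class PropertySetter {
    public static void applyProperty(Object target, String name, String value)
            throws Exception {
        // "delay" -> "setDelay"
        String setter = "set" + Character.toUpperCase(name.charAt(0))
                + name.substring(1);
        for (Method m : target.getClass().getMethods()) {
            if (m.getName().equals(setter) && m.getParameterCount() == 1) {
                Class<?> type = m.getParameterTypes()[0];
                if (type == long.class)
                    m.invoke(target, Long.parseLong(value));
                else if (type == int.class)
                    m.invoke(target, Integer.parseInt(value));
                else
                    m.invoke(target, value);
                return;
            }
        }
        throw new IllegalArgumentException("No setter found: " + setter);
    }

    // Hypothetical filter with a setter matching the "delay" property.
    public static class DemoFilter {
        private long delayMillis;
        public void setDelay(long seconds) { delayMillis = seconds * 1000; }
        public long getDelayMillis() { return delayMillis; }
    }
}
```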
If you now restart the replicator as a slave, it will read the new property values and start running. So does it work? You can test by inserting a row on the master database as follows:
mysql> create table foo (id int primary key, data varchar(25),
    -> creation_date timestamp);
Query OK, 0 rows affected (0.04 sec)

mysql> insert into foo(id, data) values(9, 'HELLO');
Query OK, 1 row affected (0.01 sec)
Now go over to the slave and try to select the row:
mysql> select * from foo;
ERROR 1146 (42S02): Table 'mats.foo' doesn't exist
It's not there. In fact, the table has not arrived yet either. However, if you try again in 5 minutes, you will get the following:
mysql> select * from foo;
+----+-------+---------------------+
| id | data  | creation_date       |
+----+-------+---------------------+
|  9 | HELLO | 2009-01-05 16:13:33 |
+----+-------+---------------------+
1 row in set (0.00 sec)
As this post shows, filters are a very flexible way to bend Tungsten replication to your will without writing a lot of code. I did a demo of this filter to a friend and we thought of a number of variations on the time-delay idea--one of the more interesting ones is to call somebody to approve SQL statements that look dangerous. The filter would hold them up until this occurs.

The main objection to the Tungsten filter approach is that it requires you to implement using Java. This does not bother me especially, but a lot of otherwise quite nice people break into hives at the thought of writing Java code. Linas Virbalas, a colleague of mine, is working on a generic filter that connects to JavaScript. I'm really looking forward to see this work. It will open up a lot of interesting doors to extend replication flexibly and easily using scripts.

p.s., You don't have to write this filter yourself. The time-delay filter is checked into our SVN repository and will be released in Tungsten 1.0 beta-4, due out on or around January 9th, 2009. Stop by our community website to learn more. Downloads are located on the Continuent Forge.

Sep 14, 2008

Java Service Wrapper Is *Very* Handy

If you write network services using Java, you should look into the Java Service Wrapper (JSW). The JSW turns Java programs from weak delicate creatures easily killed by an errant Ctrl-C into robust network services that boot up automatically, ignore most signals, and restart automatically following crashes. It's free for open source programs and has very reasonable licensing fees for commercial software.

We use JSW on several of our projects including the Tungsten Replicator and the Tungsten Connector. I just checked in a new project on our Tungsten Commons site with an Ant script that automatically copies the open source versions of JSW into a project directory with a conventional layout including bin and lib directories. Check it out here if you would like an example of how to automate addition of JSW wrappers to your own Java projects.