Sunday, March 15, 2009

Announcing Tungsten Finite State Machine Library

It is my pleasure to announce publication of the Tungsten Finite State Machine Library (Tungsten FSM) as an open source package hosted on SourceForge.net. This is the first of four new components for database clustering and replication that we will be releasing into open source during the month of March.

Tungsten FSM is a Java library for implementing in-memory state machines. It is lightweight and very simple to use. Each Tungsten FSM state machine tracks the state of a particular instance (i.e., object) in Java. The model provides standard features of state machines including states, transitions, actions, and error handling. Source and binary downloads are available here--there is also wiki documentation that explains how to use Tungsten FSM with code examples.
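To make the model concrete, here is a minimal sketch of an in-memory state machine with states, transitions, actions, and simple error handling. It is not the Tungsten FSM API: the class `SimpleFsmSketch`, its `State` and `Event` enums, the `Action` interface, and the policy of moving to a `FAILED` state when an action throws are all assumptions made up for illustration. See the wiki documentation for the real classes.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only: every name here is hypothetical, not the Tungsten FSM API.
class SimpleFsmSketch<T> {
    public enum State { OFFLINE, ONLINE, FAILED }
    public enum Event { START, STOP }

    /** Action executed while a transition fires; throwing signals an error. */
    public interface Action<E> {
        void run(E entity) throws Exception;
    }

    /** Target state plus the action to run when the transition is taken. */
    private static final class Transition<E> {
        final State target;
        final Action<E> action;

        Transition(State target, Action<E> action) {
            this.target = target;
            this.action = action;
        }
    }

    private final Map<String, Transition<T>> table = new HashMap<>();
    private final T entity;                 // the object whose state is tracked
    private State current = State.OFFLINE;

    public SimpleFsmSketch(T entity) {
        this.entity = entity;
    }

    public void addTransition(State from, Event on, State to, Action<T> action) {
        table.put(from + "/" + on, new Transition<>(to, action));
    }

    public State getState() {
        return current;
    }

    /** Looks up the transition for (current state, event) and fires it. */
    public void applyEvent(Event event) {
        Transition<T> t = table.get(current + "/" + event);
        if (t == null) {
            throw new IllegalStateException(
                "Event " + event + " not allowed in state " + current);
        }
        try {
            t.action.run(entity);
            current = t.target;
        } catch (Exception e) {
            // Assumed error policy: a failed action moves the machine to FAILED.
            current = State.FAILED;
        }
    }

    public static void main(String[] args) {
        StringBuilder service = new StringBuilder();
        SimpleFsmSketch<StringBuilder> fsm = new SimpleFsmSketch<>(service);
        fsm.addTransition(State.OFFLINE, Event.START, State.ONLINE, s -> s.append("started;"));
        fsm.addTransition(State.ONLINE, Event.STOP, State.OFFLINE, s -> s.append("stopped;"));
        fsm.applyEvent(Event.START);
        fsm.applyEvent(Event.STOP);
        System.out.println(fsm.getState() + " " + service);
    }
}
```

In a design like this, supporting a new operation amounts to one more `addTransition` call, and any event that has no entry for the current state is rejected rather than silently handled.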

Here's a little background on the Tungsten FSM library and how it arose. State machines let you model the behavior of complex systems, including system states and inputs/outputs, in a simple and understandable way. They are as important for distributed systems as transactions are for databases. Among other things, state machines enable you to ensure that network services behave deterministically when presented with multiple, concurrent inputs. That determinism in turn is a fundamental requirement for organizing groups of processes into database clusters, which is what we do at Continuent.

When we embarked on development of Tungsten Replicator, it was immediately apparent we would need to implement state machines. Most of the available libraries, such as JBoss jBPM, were heavyweight or otherwise difficult to embed. We therefore wrote a lightweight library of our own, adding features as we ran into practical implementation issues like dealing with errors. Tungsten FSM helps us organize code in services--for instance, it has been very easy to add new administrative operations to the Tungsten Replicator simply by adding more state transitions.

However, you don't have to take my word for it. Try out Tungsten FSM and let me know how you like it. For more information on Tungsten in general, please visit our community pages.

p.s., I'm posting this article to aggregators for MySQL and PostgreSQL even though it's not directly related to databases per se. State machines turn out to be essential to database clustering and management, as you'll see in some of the succeeding articles on this blog.

8 comments:

Philippe Lang said...

Hi,

I'm not able to access your wiki: is that normal?

Regards,

Philippe Lang

Robert Hodges said...

No it's not normal! I'm working on fixing permissions now. Thanks for posting!

Tom Baeyens said...

Could you list the concrete features that were missing in jBPM? Or what blockers you encountered when embedding jBPM?

regards,
Tom Baeyens
Lead jBPM

Robert Hodges said...

Hi Tom!

Just to be clear, I started out planning to use jBPM, as I used it in the past and really liked it. However, it was not missing features--indeed quite the opposite. jBPM turned out to be a lot more complex than we needed. Tungsten FSM has fewer than 25 classes and no dependencies other than log4j. There's no external configuration as it's all driven from within the code itself. We write network services that need to be pretty low to the ground, so the simplicity was just what we needed. It was a little surprising to me that such a "micro-library" for state machines did not already exist or at least was not obvious from searching with Google. The programming model is apparently not as well known as it should be.

Cheers, Robert

p.s., jBPM is really good. Congrats on a great implementation.

Robert Hodges said...

@Philippe, the wiki should be fixed. The URL I used seems to have been incorrect in some way.

xasima said...

Could you please point out the differences from the
http://commons.apache.org/scxml/ project, other than owning the development yourselves (so as not to depend on external libs)?

xasima said...

Is your FSM planned to run on top of a cluster... probably backed by a distributed STM to synchronize state changes/transitions across multiple instances (so the state engine would be spawned across the cluster, if several nodes need to be supervised by the same state model)?

If not, I can hardly see any difference from the small, simple, embeddable apache-scxml...

Robert Hodges said...

@xasima

I wrote FSM to deal with a very specific problem, which was to write reliable network services that were free from race conditions when processing external commands and events. FSM is much smaller than SCXML (about 4KLOC vs 24KLOC, 30 classes vs 144) and eschews XML configuration files, which I find ugly and hard to debug for large state machines. FSM also has fewer features. For example, it does not support parallel state machines, which are handy for some types of applications.

FSM state machines are therefore local to a single process and ephemeral. Within Tungsten we *do* deal with distributed state as a part of management, but this is implemented using group communications rather than as an extension of FSM.

I looked at SCXML before writing FSM but found it more complex than I wanted. One more thing--at least for our applications FSM appears to be bug free. In particular, FSM offers very simple and strong guarantees for concurrent behavior, as state machines serialize when processing events. SCXML presumably does this as well, but FSM is so small you can understand and reason about concurrent behavior very easily.
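As a rough illustration of what serializing event processing can look like, the sketch below guards a tiny two-state machine with a single lock, so events arriving from several threads are applied one at a time. It is an assumption-laden example rather than Tungsten FSM code: the class `SerializedFsmSketch`, the toggle event, and the `runDemo` helper are hypothetical names invented here.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: hypothetical names, not the Tungsten FSM API.
class SerializedFsmSketch {
    public enum State { OFFLINE, ONLINE }

    private State current = State.OFFLINE;
    private int transitions = 0;  // plain int; safe only because access is synchronized

    /** Applies one toggle event; the lock ensures one event is processed at a time. */
    public synchronized void applyToggle() {
        current = (current == State.OFFLINE) ? State.ONLINE : State.OFFLINE;
        transitions++;
    }

    public synchronized State getState() {
        return current;
    }

    public synchronized int getTransitions() {
        return transitions;
    }

    /** Fires eventsPerThread toggles from each of several threads, then waits for them. */
    public static SerializedFsmSketch runDemo(int threads, int eventsPerThread) {
        SerializedFsmSketch fsm = new SerializedFsmSketch();
        List<Thread> workers = new ArrayList<>();
        for (int i = 0; i < threads; i++) {
            Thread t = new Thread(() -> {
                for (int j = 0; j < eventsPerThread; j++) {
                    fsm.applyToggle();
                }
            });
            workers.add(t);
            t.start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new RuntimeException(e);
            }
        }
        return fsm;
    }

    public static void main(String[] args) {
        SerializedFsmSketch fsm = runDemo(4, 1000);
        System.out.println(fsm.getState() + " after " + fsm.getTransitions() + " transitions");
    }
}
```

Because every event runs under the same lock, no transition is lost and each one sees a consistent current state, whatever order the threads happen to run in.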
