Last week I attended an incredibly intense conference in Lalandia, Denmark: Miracle Oracle Open World. According to Mogens Norgaard, the organizer, the conference devotes 80% of the time to intense discussions of Oracle databases and 80% of the time to drinking. During the festivities you get this dim mental image of what it would have been like if Vikings had access to 16-core machines and advanced database software. But I digress.
Anyway, Lalandia is located on just that kind of spare, beautiful coast that clears the mind to look for fundamental truths. And sure enough, a talk by Carel-Jan Engel nailed one of them: simplicity is the key to availability.
At some level we all understand the idea. The more components you have in a system, the more likely it is that one or more of them will fail, either because of a defect or an administrative error. The trouble is we don't act on our intuitions. Carel-Jan showed the Oracle MAA (Maximum Availability Architecture), which looks like this in the marketing pictures:
MAA is the recommended way to create a highly available system using RAC and Data Guard. And suddenly it hits you--there are a lot of moving parts. In seeking redundancy, the authors of the design have created tremendous complexity and hence opportunities for failures. It's an example of what Jeremiah Wilton once allegedly described as "design for maximum failability." I don't know if Jeremiah really said that but it describes the problem pretty well.
And this was Carel-Jan's point. Availability is not something you just purchase and roll in the door on wheels. You get it by engineering very simple systems that have few points of failure. In the Oracle world it often means buying Oracle SE instead of RAC. And running it on standard hardware linked together with replication. Plus, of course, changing your applications so they work within the limitations of the rest of the system. Want to stay available without losing data? Keep the rate of updates low. Performance overload? Partition data into separate systems. You get the idea.
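To put rough numbers on the "more components, more failures" intuition, here is a back-of-the-envelope sketch in Python. It is my own illustration, not something from Carel-Jan's talk, and it leans on two loud assumptions: every component fails independently, and every component has the same made-up failure probability over some period. Real systems are messier, and redundancy does absorb some of these failures, but the trend is the point.

```python
def prob_any_failure(n_components: int, p_fail: float) -> float:
    """Chance that at least one of n components fails in a given period.

    Assumes independent failures with identical probability p_fail
    (an illustrative simplification, not a measured figure).
    """
    # All n components survive with probability (1 - p_fail) ** n,
    # so at least one fails with the complement of that.
    return 1.0 - (1.0 - p_fail) ** n_components


if __name__ == "__main__":
    # Hypothetical 1% failure chance per component per period.
    for n in (2, 5, 20):
        print(f"{n:>2} components: {prob_any_failure(n, 0.01):.4f}")
```

Each of those failures is something somebody has to detect, diagnose, and handle correctly, which is where the administrative errors creep in.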
In short, keep it really simple, like this:
This is simple availability. It's very beautiful. Open source database communities have understood this idea for a long time. My goal is to write software that makes it work better for them and for Oracle users as well.