Comments on The Scale-Out Blog: Why Aren't All Data Immutable?
Feed author: Robert Hodges. Feed updated 2023-11-16T03:16:54.746-08:00.

Robert Hodges, 2014-02-18T08:40:37.778-08:00

@Robert, regarding MVCC, there's no question you want to keep only a subset of data visible for most OLTP operations. That's the only way to make reads work efficiently when you have a small working set relative to the log.

Still, I think history may prove Pat Helland right when he said that current DBMS contents are a cache of the tail of the log (http://blogs.msdn.com/b/pathelland/archive/2007/06/14/accountants-don-t-use-erasers.aspx). One obvious way to make this work is using RAM and SSD, which have also dropped in price quite substantially. I'm not an internals expert, but the MVCC view could work like a secondary index in that it's derived rather than primary. The economics are going in a direction where people will keep trying this until they figure out how to get it to work. That's really my point.

Robert Young, 2014-02-18T06:09:29.528-08:00

Accounting systems, since at least the first RDBMS versions (post-COBOL), are in fact based on immutable data. Prior to that, one would "close the month", which would wipe out the month-to-close ledger/journal/register (different names for the same thing) transaction "files" and post the aggregates as line items in that month's entry in the yearly "file".

With RDBMS-based accounting, clients had the option (they could continue to destructively close if they wanted). The problem was/is that RBAR (row-by-agonizing-row) processing each and every time is a pain. With SSDs, multi-processor, large-RAM machines, it is getting feasible to store only the transaction table(s) and dispense with aggregates.

Anyone who uses an MVCC database knows how slow keeping everything around can be.

Anonymous, 2014-02-17T20:48:28.806-08:00

Immutable data is fine for stuff that doesn't have to be updated, which by definition makes it immutable.

Using an immutable store for data that actually is updated can be costly because it is generally necessary to retrieve many versions of a row when processing queries, which can reduce performance.

It is possible to create an append-only log-structured table in any database. You can then use your own version and pruning logic, and maintain summary tables that contain only the newest picture of rows and other options.

This allows you to roll your own "flash back query", for example.

Because the tables are log structured, you don't need separate log tables for the changes to the tables themselves, so maintaining materialized views and rolling things forward/backward is relatively straightforward.
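
To make the last comment's suggestion concrete, here is a minimal sketch of an append-only table with a "newest picture" view, a hand-rolled flashback query, and simple pruning. It uses Python's built-in sqlite3 module purely for illustration. The table name account_log, the view account_current, the column names, and the helper functions are all hypothetical and do not come from the comment, and a real system would likely version by timestamp or transaction ID rather than a plain counter.

```python
import sqlite3


def open_store():
    """Create an in-memory database with an append-only log table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE account_log (
            version INTEGER PRIMARY KEY AUTOINCREMENT,  -- monotonically increasing
            account TEXT    NOT NULL,
            balance INTEGER NOT NULL,
            deleted INTEGER NOT NULL DEFAULT 0           -- tombstone instead of DELETE
        );
        -- "Newest picture": the latest non-deleted version of each account.
        CREATE VIEW account_current AS
            SELECT l.account, l.balance
            FROM account_log AS l
            WHERE l.deleted = 0
              AND l.version = (SELECT MAX(m.version)
                               FROM account_log AS m
                               WHERE m.account = l.account);
        """
    )
    return conn


def append(conn, account, balance, deleted=False):
    """Record a change by appending a row; existing rows are never updated."""
    cur = conn.execute(
        "INSERT INTO account_log (account, balance, deleted) VALUES (?, ?, ?)",
        (account, balance, int(deleted)),
    )
    return cur.lastrowid  # the version number of this change


def current(conn):
    """Read the newest picture through the view."""
    return dict(conn.execute("SELECT account, balance FROM account_current"))


def as_of(conn, version):
    """Hand-rolled flashback query: the state as it was at the given version."""
    rows = conn.execute(
        """
        SELECT l.account, l.balance
        FROM account_log AS l
        WHERE l.deleted = 0
          AND l.version = (SELECT MAX(m.version)
                           FROM account_log AS m
                           WHERE m.account = l.account AND m.version <= ?)
        """,
        (version,),
    )
    return dict(rows)


def prune(conn, keep_from):
    """Drop rows already superseded at keep_from; flashback to keep_from or later still works."""
    cur = conn.execute(
        """
        DELETE FROM account_log
        WHERE EXISTS (SELECT 1 FROM account_log AS m
                      WHERE m.account = account_log.account
                        AND m.version > account_log.version
                        AND m.version <= ?)
        """,
        (keep_from,),
    )
    return cur.rowcount  # number of old versions removed


conn = open_store()
v1 = append(conn, "alice", 100)
v2 = append(conn, "bob", 50)
v3 = append(conn, "alice", 70)
v4 = append(conn, "bob", 0, deleted=True)
print("current:", current(conn))
print("as of v2:", as_of(conn, v2))
```

The cost the comments mention shows up in the correlated MAX(version) subquery: every read has to find the latest version per key. That is why a separately maintained summary table, or an index on (account, version), becomes attractive as the log grows.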