In a recent customer engagement we enabled Tungsten Replicator with a MySQL application that originally ran on a server that did not use replication. QA promptly discovered reports that previously ran in 10 seconds were now running in as many minutes. It turned out that the reports used temp tables to assemble data, and these were being written into the master binlog. This created bloated binlogs and extremely slow reports. We fixed the problem by enabling row replication (i.e., binlog-format=row in my.cnf).
A common DBA response to temp table problems is to try to eliminate them completely, as suggested in the excellent High Performance MySQL, 3rd Edition. (See p. 502.) Elimination is a good philosophy when applications use temp tables to generate updates. However, it does not work for reporting. Temp tables allow you to stage data for complex reports across a series of transactions, then pull the final results into a report writer like JasperReports. This modular approach is easy to implement and maintain afterwards. Eliminating temp tables in such cases can create an unmaintainable mess.
The real solution with report temp tables is to keep them out of the master binlog. Here is a list of common ways to do so. Let me know if you know others.
* Turn off binlog updates. Issue 'SET SESSION SQL_LOG_BIN=0' when generating reports. The downside is that it requires SUPER privilege to set. Also, if you make a code mistake and update normal tables with this setting, your changes will not be replicated.
* Use a non-replicated database. Configure the master my.cnf with binlog-ignore-db as follows to ignore any update (including on temp tables) that is issued when database 'scratch' is the default database:
binlog_ignore_db = scratch
This approach does not require special privileges. However coding errors or connection pool misconfigurations are obvious liabilities. Your application must either connect to the scratch database or issue an explicit use command. Otherwise, temp table operations will be logged, as in the following example:
create temporary table scratch.report1_temp(name varchar(256), entry_time date, exit_time date);
* Use a slave with the binlog disabled. Remove the log-bin option from my.cnf. This works well if you have extra reporting slaves that are caught up. However, it may not work if the reports must be fully up-to-date or you need the ability to promote the slave quickly to a master, in which case the binlog must be enabled.
* Use row replication. You can set row replication at the session level using 'SET SESSION binlog_format=row', which requires SUPER privilege, or overall by setting binlog-format in my.cnf. In this case CREATE TEMPORARY TABLE and updates on temp tables do not appear in the binlog at all. The downside of enabling row replication fully is that it can lead to bloated logs and blocked servers if you have very large transactions. SQL operations like DELETE that affect multiple rows are stored far more compactly in statement replication. Also, reloading mysqldump files can be very slow in row replication compared to statement replication, which can handle block inserts generated by the --extended-insert option.
The proper solution to keep replication from hurting your use of temp tables will vary depending on your application as well as the way you run your site. For my money, though, this is a good example of where row replication really helps and deserves a closer look.
MySQL could use some feature improvements in the area of temp tables and replication. I find it surprising that mixed mode replication does not fully suppress temp table binlog updates. Only row replication does so. Second, it would be great to have a CREATE TABLE option to suppress logging particular tables to the binlog. This would allow applications to make the logging decision at schema design time. Finally, global options to suppress binlogging of specific table types, such as temp tables and MEMORY tables would be useful. Perhaps we will see some of these in future MySQL releases.