
maria-developers team mailing list archive

Updated (by Psergey): Add an option to mysqlbinlog to produce SQL script with fewer roundtrips (37)

 

-----------------------------------------------------------------------
                              WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Add an option to mysqlbinlog to produce SQL script with fewer
		roundtrips
CREATION DATE..: Fri, 07 Aug 2009, 17:14
SUPERVISOR.....: Monty
IMPLEMENTOR....: 
COPIES TO......: 
CATEGORY.......: Server-RawIdeaBin
TASK ID........: 37 (http://askmonty.org/worklog/?tid=37)
VERSION........: Server-9.x
STATUS.........: Un-Assigned
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0

PROGRESS NOTES:

-=-=(Psergey - Sun, 09 Aug 2009, 12:56)=-=-
High-Level Specification modified.
--- /tmp/wklog.37.old.22083     2009-08-09 12:56:36.000000000 +0300
+++ /tmp/wklog.37.new.22083     2009-08-09 12:56:36.000000000 +0300
@@ -11,3 +11,16 @@
   if (my_b_tell(&cache) != 0)
     my_b_write(&cache,";;",2);
 
+Note: mysqlbinlog already uses 
+ 
+  DELIMITER /*!*/;
+
+so that it can process "multi-statements" like 
+
+  CREATE PROCEDURE ... BEGIN stmt1; stmt2; ... END 
+
+what remains to be done is to print the /*!*/; only when we're about to exceed
+$args[combine-statements] bytes. In all other cases, delimit statements with
+regular semicolon.
+
+

-=-=(Psergey - Sun, 09 Aug 2009, 12:30)=-=-
High Level Description modified.
--- /tmp/wklog.37.old.21090     2009-08-09 12:30:26.000000000 +0300
+++ /tmp/wklog.37.new.21090     2009-08-09 12:30:26.000000000 +0300
@@ -1,6 +1,6 @@
 SQL scripts generated by mysqlbinlog can be slow to load because they have many
 small queries, hence applying the script against a remote server requires a lot
-of roundtrips, and they become a bottleneck.
+of roundtrips, and the network roundtrips become the bottleneck.
 
 This bottleneck can be addressed by having mysqlbinlog combine multiple
 statements into one:
@@ -14,7 +14,7 @@
 
 loading such sql script will require fewer roundtrips. 
 
-The behavior can be controlled using a command line option
+The behaviour can be controlled using a command line option
 
 mysqlbinlog --combine-statements=#
 

-=-=(Psergey - Sun, 09 Aug 2009, 12:24)=-=-
Dependency created: 39 now depends on 37

-=-=(Psergey - Fri, 07 Aug 2009, 17:16)=-=-
High-Level Specification modified.
--- /tmp/wklog.37.old.20454     2009-08-07 17:16:54.000000000 +0300
+++ /tmp/wklog.37.new.20454     2009-08-07 17:16:54.000000000 +0300
@@ -1 +1,13 @@
+Implementation overview:
+
+- At start, print "--delimiter=;;"
+- Modify the start of each print functions as follows
+
+  if (my_b_tell(&cache) - my_start_of_combine_statement) + 
+      estimiated_size_of_log_event) > combine_statement_size)
+    my_b_write(&cache,";;",2);
+
+- And we should end mysqlbinlog with;
+  if (my_b_tell(&cache) != 0)
+    my_b_write(&cache,";;",2);
 



DESCRIPTION:

SQL scripts generated by mysqlbinlog can be slow to load because they consist
of many small queries. Applying such a script to a remote server therefore
requires a large number of network roundtrips, which become the bottleneck.

This bottleneck can be addressed by having mysqlbinlog combine multiple
statements into one:

+delimiter //
 binlog statement1;
 binlog statement2;
 binlog statement3;
+//
 binlog statement4;

Loading such an SQL script requires fewer roundtrips.

The behaviour can be controlled using a command-line option:

mysqlbinlog --combine-statements=#

where # is the maximum allowed packet length in bytes.


HIGH-LEVEL SPECIFICATION:



Implementation overview:

- At start, print "--delimiter=;;"
- Modify the start of each print function as follows:

  if ((my_b_tell(&cache) - my_start_of_combine_statement) +
      estimated_size_of_log_event > combine_statement_size)
    my_b_write(&cache,";;",2);

- And we should end mysqlbinlog with:
  if (my_b_tell(&cache) != 0)
    my_b_write(&cache,";;",2);
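
The two checks above can be tried out in isolation with the following minimal
sketch. It is not mysqlbinlog code: a fixed in-memory buffer stands in for the
IO_CACHE, and combine_out, out_write, print_event and finish are hypothetical
names introduced only for illustration. It also assumes that each statement is
terminated with ";\n" and that an empty batch is never closed, which the
overview does not state.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the IO_CACHE and counters named in the overview. */
typedef struct {
    char   buf[4096];    /* output script; plays the role of `cache`          */
    size_t pos;          /* what my_b_tell(&cache) would return               */
    size_t batch_start;  /* plays the role of my_start_of_combine_statement   */
    size_t limit;        /* plays the role of combine_statement_size          */
} combine_out;

/* Append n bytes and keep the buffer NUL-terminated. */
static void out_write(combine_out *o, const char *s, size_t n)
{
    assert(o->pos + n < sizeof(o->buf));  /* the sketch has no growth logic */
    memcpy(o->buf + o->pos, s, n);
    o->pos += n;
    o->buf[o->pos] = '\0';
}

/* Print one statement, closing the current batch first if adding the
   statement would push the batch past the limit. */
static void print_event(combine_out *o, const char *stmt)
{
    size_t estimated = strlen(stmt) + 2;  /* statement plus ";\n" */

    if (o->pos > o->batch_start &&        /* assumption: skip empty batches */
        (o->pos - o->batch_start) + estimated > o->limit) {
        out_write(o, ";;\n", 3);
        o->batch_start = o->pos;
    }
    out_write(o, stmt, strlen(stmt));
    out_write(o, ";\n", 2);
}

/* End of output: close whatever batch is still open. */
static void finish(combine_out *o)
{
    if (o->pos > o->batch_start) {
        out_write(o, ";;\n", 3);
        o->batch_start = o->pos;
    }
}
```

With limit set to 20 and three 8-character statements, the first two share a
batch and the third starts a new one, so the script contains two ";;"
terminators instead of three separate queries.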

Note: mysqlbinlog already uses 
 
  DELIMITER /*!*/;

so that it can process "multi-statements" like 

  CREATE PROCEDURE ... BEGIN stmt1; stmt2; ... END 

What remains to be done is to print the /*!*/; delimiter only when we're about
to exceed $args[combine-statements] bytes. In all other cases, delimit
statements with a regular semicolon.
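
The delimiter choice described in the note can be expressed as one small
function. This is only a sketch under assumptions: pick_delimiter and its
parameters are hypothetical names, and the sizes are assumed to be known (or
estimated) by the caller before the delimiter is written.

```c
#include <assert.h>
#include <stddef.h>

#define BATCH_DELIM "/*!*/;"  /* the delimiter mysqlbinlog already sets  */
#define PLAIN_DELIM ";"       /* keeps the statement inside the batch    */

/* Hypothetical helper: choose the terminator for the statement that was
   just written.
     batch_bytes      bytes already in the current batch, including the
                      statement just written
     next_event_bytes estimated size of the next event
     combine_limit    value given to --combine-statements */
static const char *pick_delimiter(size_t batch_bytes,
                                  size_t next_event_bytes,
                                  size_t combine_limit)
{
    assert(combine_limit > 0);
    return (batch_bytes + next_event_bytes > combine_limit)
               ? BATCH_DELIM   /* next event would not fit: end the batch */
               : PLAIN_DELIM;  /* next event fits: keep combining         */
}
```

The last statement of the script would still need the /*!*/; delimiter
regardless of size, so that the final batch is sent.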



ESTIMATED WORK TIME

ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)