← Back to team overview

maria-developers team mailing list archive

Updated (by Knielsen): mysqlbinlog: remove undesired effects of format description binlog statement (50)

 

-----------------------------------------------------------------------
                              WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: mysqlbinlog: remove undesired effects of format description binlog
		statement 
CREATION DATE..: Mon, 17 Aug 2009, 14:12
SUPERVISOR.....: Monty
IMPLEMENTOR....: 
COPIES TO......: 
CATEGORY.......: Server-RawIdeaBin
TASK ID........: 50 (http://askmonty.org/worklog/?tid=50)
VERSION........: Server-9.x
STATUS.........: Un-Assigned
PRIORITY.......: 60
WORKED HOURS...: 2
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0

PROGRESS NOTES:

-=-=(Knielsen - Tue, 08 Sep 2009, 14:29)=-=-
High-Level Specification modified.
--- /tmp/wklog.50.old.17531     2009-09-08 14:29:30.000000000 +0300
+++ /tmp/wklog.50.new.17531     2009-09-08 14:29:30.000000000 +0300
@@ -12,3 +12,37 @@
 One question that one needs to sort out before disabling server_id change is 
 why it was put there in the first place?  Should it be always removed?
 
+First, the problem is not the format description log event. In fact all log
+events have their own value of server_id, and every event in the BINLOG
+statement is executed with its own id as server_id.
+
+It seems it was introduced deliberately in the patch for Bug#32407. This patch
+makes sql thread and BINLOG statement share code path for executing the binary
+event, even though the bug is really about something different (outputting
+format description event to allow proper execution of other BINLOG
+statements).
+
+Nevertheless, I think using the server_id in the events of a BINLOG statement
+is wrong:
+
+1. It is different behaviour from executing normal statement-based events not
+written using BINLOG.
+
+2. It causes the possibility for infinite cycle of replication in a cyclic
+replication setup, if the server ids in BINLOG are different from all other
+server ids in the cycle.
+
+3. The functionality of applying events with original server id from the event
+is still available by pointing the slave thread to the binlog. The
+functionality of applying row-based binlog events with server id of the server
+executing the BINLOG statements is not otherwise available.
+
+In fact most of the code from the slave thread that is now also run when
+executing BINLOG statements is redundant, or does the wrong thing (like
+updating binlog position).
+
+Maybe there is a confusion on the purpose of the BINLOG statement. Is it to be
+able to apply row-based events manually to a server (mysqlbinlog|mysql)? Or is
+it to be able to simulate the effect of the slave sql thread? I think the
+former is the most important one.
+

-=-=(Knielsen - Mon, 07 Sep 2009, 10:06)=-=-
Research worklog, bugs, and existing server code.

Worked 2 hours and estimate 0 hours remain (original estimate increased by 2 hours).

-=-=(Psergey - Mon, 17 Aug 2009, 14:13)=-=-
Dependency created: 39 now depends on 50

-=-=(Psergey - Mon, 17 Aug 2009, 14:13)=-=-
High-Level Specification modified.
--- /tmp/wklog.50.old.11389     2009-08-17 14:13:05.000000000 +0300
+++ /tmp/wklog.50.new.11389     2009-08-17 14:13:05.000000000 +0300
@@ -1 +1,14 @@
+First item
+----------
+AFAIU what needs to be done is:
+1. record a source server version (it is in the first binlog event). 
+2. don't emit a BINLOG statement if the recorded version number is 5.0.x or
+  below.
+
+and we'll get a 5.0-applicable binlog.
+
+Second item
+-----------
+One question that one needs to sort out before disabling server_id change is 
+why it was put there in the first place?  Should it be always removed?
 



DESCRIPTION:

According to complaints in BUG#46640:

1. mysqlbinlog will emit a 5.0-incompatible "format description binlog 
statement" even when reading a binary log that was produced by 5.0. 
This will cause an error when one tires to apply mysqlbinlog output to a
5.0.x server.

2. When one applies "format description binlog statement" at the slave, it 
will change the slave's server_id when applied. 

This WL is to fix these issues.


HIGH-LEVEL SPECIFICATION:



First item
----------
AFAIU what needs to be done is:
1. record a source server version (it is in the first binlog event). 
2. don't emit a BINLOG statement if the recorded version number is 5.0.x or
  below.

and we'll get a 5.0-applicable binlog.

Second item
-----------
One question that one needs to sort out before disabling server_id change is 
why it was put there in the first place?  Should it be always removed?

First, the problem is not the format description log event. In fact all log
events have their own value of server_id, and every event in the BINLOG
statement is executed with its own id as server_id.

It seems it was introduced deliberately in the patch for Bug#32407. This patch
makes sql thread and BINLOG statement share code path for executing the binary
event, even though the bug is really about something different (outputting
format description event to allow proper execution of other BINLOG
statements).

Nevertheless, I think using the server_id in the events of a BINLOG statement
is wrong:

1. It is different behaviour from executing normal statement-based events not
written using BINLOG.

2. It causes the possibility for infinite cycle of replication in a cyclic
replication setup, if the server ids in BINLOG are different from all other
server ids in the cycle.

3. The functionality of applying events with original server id from the event
is still available by pointing the slave thread to the binlog. The
functionality of applying row-based binlog events with server id of the server
executing the BINLOG statements is not otherwise available.

In fact most of the code from the slave thread that is now also run when
executing BINLOG statements is redundant, or does the wrong thing (like
updating binlog position).

Maybe there is a confusion on the purpose of the BINLOG statement. Is it to be
able to apply row-based events manually to a server (mysqlbinlog|mysql)? Or is
it to be able to simulate the effect of the slave sql thread? I think the
former is the most important one.


ESTIMATED WORK TIME

ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)