maria-developers team mailing list archive

Re: slave_ddl_exec_mode and incompatible change in MariaDB 10.0.8

 

Hi!

>>>>> "Pavel" == Pavel Ivanov <pivanof@xxxxxxxxxx> writes:

Pavel> And now I found that this change is actually buggy. It turns out that
Pavel> when slave executes a standalone CREATE TABLE event now it will set
Pavel> OPTION_BEGIN flag in thd->variables.option_bits and won't reset it. I
Pavel> don't know whether slave keeps transaction actually not committed
Pavel> and/or whether it doesn't clean up some other transaction data, but
Pavel> execution of the next event will always think there is a transaction
Pavel> open and it needs to be auto-committed.

I checked my patch, but I could not find any case where I had added
code that sets OPTION_BEGIN, except in connection with OPTION_GTID_BEGIN.
OPTION_GTID_BEGIN is only set when we *know* that there will be a
COMMIT event following in the log.

I also tried to verify this by running a test that does this on the master:

"create table t2 (a int) engine=myisam"

I added a breakpoint for the slave in
"mysql_create_table"

The OPTION_BEGIN flag was not set either when the function was entered
or when it exited.

Can you give me an example of where things go wrong, preferably with
an extract from the binary log that shows what is actually logged?

For example, here is how a normal create table is logged.
(From suite/rpl/r/create_or_replace_row.result)

slave-bin.000001        #       Gtid    #       #       GTID #-#-#
slave-bin.000001        #       Query   #       #       use `test`; create table t2 (a int) engine=myisam
slave-bin.000001        #       Gtid    #       #       BEGIN GTID #-#-#

The GTID above should not set OPTION_BEGIN or OPTION_GTID_BEGIN on the
slave.

However a CREATE ... SELECT will look like:

master-bin.000001       #       Gtid    #       #       BEGIN GTID #-#-#
master-bin.000001       #       Query   #       #       use `test`; CREATE TABLE `t1` (
  `f1` int(1) NOT NULL DEFAULT '0'
)
master-bin.000001       #       Table_map       #       #       table_id: # (test.t1)
master-bin.000001       #       Write_rows_v1   #       #       table_id: # flags: STMT_END_F
master-bin.000001       #       Query   #       #       COMMIT

The above will set OPTION_BEGIN and OPTION_GTID_BEGIN for the
CREATE statement, and both will be reset by the COMMIT (which is
guaranteed to follow).
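
To make the two cases concrete, here is a small toy model in Python. It is
not the server code: the bit values, the apply_event() and replay() helpers
and the (type, info) tuples are assumptions made for illustration only. It
only mirrors the rule described above: a standalone GTID leaves the bits
alone, a BEGIN GTID sets both, and the COMMIT clears them.

```python
# Toy model of the rule described above; NOT MariaDB server code.
# The bit values are hypothetical placeholders for the real flags.
OPTION_BEGIN = 1 << 0
OPTION_GTID_BEGIN = 1 << 1
TRX_BITS = OPTION_BEGIN | OPTION_GTID_BEGIN


def apply_event(option_bits, event_type, info):
    """Return the option bits after the slave applies one binlog event."""
    if event_type == "Gtid" and info.startswith("BEGIN"):
        # "BEGIN GTID": a COMMIT is known to follow, so both bits are set.
        return option_bits | TRX_BITS
    if event_type == "Query" and info == "COMMIT":
        # The terminating COMMIT resets both bits.
        return option_bits & ~TRX_BITS
    # Standalone GTID, DDL text and row events leave the bits unchanged.
    return option_bits


def replay(events, option_bits=0):
    """Apply events in order and return the bits seen after each one."""
    history = []
    for event_type, info in events:
        option_bits = apply_event(option_bits, event_type, info)
        history.append(option_bits)
    return history


# Standalone CREATE TABLE, as in the first extract.
standalone_create = [
    ("Gtid", "GTID #-#-#"),
    ("Query", "use `test`; create table t2 (a int) engine=myisam"),
]

# CREATE ... SELECT, as in the second extract.
create_select = [
    ("Gtid", "BEGIN GTID #-#-#"),
    ("Query", "use `test`; CREATE TABLE `t1` ( `f1` int(1) NOT NULL DEFAULT '0' )"),
    ("Table_map", "table_id: # (test.t1)"),
    ("Write_rows_v1", "table_id: # flags: STMT_END_F"),
    ("Query", "COMMIT"),
]
```

In this model replay(standalone_create) never sets a bit, while
replay(create_select) keeps both bits set until the final COMMIT.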

Pavel> But that also means that this
Pavel> state cannot be distinguished from the case when slave received BEGIN
Pavel> event, but didn't receive COMMIT event, i.e. either binlog on master
Pavel> is corrupted or slave somehow skipped some events.

- Corrupted binary logs should not be a concern.  In this case the
  binary log can contain anything, including wrong DROP DATABASE
  commands that could do anything.
- If the master fails, the slave will notice this because it finds a
  'binlog start event', which will reset the BEGIN bits.
- In other words, there will always be a COMMIT event (either explicit
  or implicit, as with a binlog start event).
- The slave can only skip events with slave_skip_counter, but in this
  case it will not be in BEGIN mode. During slave_skip_counter COMMIT
  events will be noticed and the bit will be reset.
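
The explicit/implicit COMMIT point in the list above can be sketched the same
way. This is again a toy model, not server code: the bit values, the
bits_after() helper and the event labels (including "BINLOG_START" for the
binlog start event) are hypothetical names chosen for this sketch.

```python
# Toy model; NOT MariaDB server code. Bit values and event labels are
# hypothetical and exist only for this illustration.
OPTION_BEGIN = 1 << 0
OPTION_GTID_BEGIN = 1 << 1
TRX_BITS = OPTION_BEGIN | OPTION_GTID_BEGIN


def bits_after(events):
    """Return the option bits left after applying the events in order."""
    bits = 0
    for event in events:
        if event == "BEGIN GTID":
            bits |= TRX_BITS
        elif event in ("COMMIT", "BINLOG_START"):
            # An explicit COMMIT, or the binlog start event written when the
            # master comes back up, ends the open transaction either way.
            bits &= ~TRX_BITS
    return bits


# Master failed after logging part of a transaction, then restarted.
master_failed = ["BEGIN GTID", "Write_rows", "BINLOG_START"]
# Normal transaction with an explicit COMMIT.
normal = ["BEGIN GTID", "Write_rows", "COMMIT"]
```

Under these assumptions both sequences end with no BEGIN bits set; only a
stream cut off before either terminator would leave them set.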

How can the binlog be corrupted?
How do you expect the master to handle corruption?
Why is CREATE TABLE a special case you are concerned about, compared
to other things like DELETE FROM TABLE in row based replication?
(DELETE FROM expects a BEGIN, table_id, many delete-row-events, COMMIT).

Pavel> Would MariaDB consider this as a serious problem?

Please show me a test case first so that I can understand the problem.

Regards,
Monty

