← Back to team overview

maria-developers team mailing list archive

Re: [Commits] Rev 4484: MDEV-4255 MIXED replication: Unique key updates are not marked unsafe, replication fails in lp:~maria-captains/maria/10.0

 

Sergei Golubchik <serg@xxxxxxxxxxx> writes:

> Kristian Nielsen, could you please review this patch?

I guess my main question here is, did you think about why you want to) fix
this 1) in a post-GA release, 2) at all?

One problem with adding new statements as unsafe for statement-based
binlogging is that it introduces new warnings in existing applications. This
is especially annoying on a slave with binlogging enabled, as every unsafe
statement now causes a warning in the slave error log. Thus, it can happen
that a security update for a user leaves them with an error log that
constantly fills up on a busy slave. It is my understanding that
statement-based binlogging (as opposed to mixed or row) is still quite
popular, and there have been complaints about these warnings before.

Another point is that this patch seems to fix only a corner case of a bigger
problem? This particular patch is about non-transactional updates. But
updating a unique key can break replication just as well on transactional
tables:

  CREATE TABLE t1 (a INT PRIMARY KEY);
  INSERT INTO t1 VALUES (2), (3), (4);
  UPDATE t1 SET a=a-1;

Since we do not have deferred constraints, this UPDATE will fail or succeed
depending on the order in which rows are updated. So a statement that succeeds
andgets binlogged on the master can break on the slave if the rows are updated
in a different order.

So any reason that you fix one part of UPDATE of unique key but not the other?

(I suppose the same problem can also happen with foreign key constraints,
though it might take some creativity to come up with an example where an
UPDATE will conflict or not with foreign keys depending on row update order).

The bug has fix version 5.5, your patch seems to be against 10.0, where do you
intend to push it? I would probably have chosen to fix in 10.1, if at all, but
I'm not necessarily against 5.5 or 10.0, I'm mostly just asking what the
reason is for whichever version was chosen?

> I don't particularly like that it explicitly invokes
> thd->decide_logging_format() . But there's no way around it, first time
> logging format is decided in lock_tables(), basically when tables are just
> opened. At that point we don't know what fields will be updated yet.

I am not really familiar with how binlogging is decided on the master, have
not worked with that code so far. So unfortunately I am not able to say much
about whether this is a problem or not. At least the call is still before the
actual update of rows starts, so I suppose it should be ok...

> At lp:~maria-captains/maria/10.0
>
> ------------------------------------------------------------
> revno: 4484
> revision-id: sergii@xxxxxxxxx-20141116234900-747qdhm9ti9xh070
> parent: sergii@xxxxxxxxx-20141113161325-mdz5ox7e86iyazss
> fixes bug: https://mariadb.atlassian.net/browse/MDEV-4255
> committer: Sergei Golubchik <sergii@xxxxxxxxx>
> branch nick: 10.0
> timestamp: Mon 2014-11-17 00:49:00 +0100
> message:
>   MDEV-4255 MIXED replication: Unique key updates are not marked unsafe, replication fails

I think this needs a better description, especially since this can cause new
warnings in slave error logs. Here is a suggestion, but feel free to use
something else if you prefer:

"MDEV-4255 MIXED replication: Unique key updates are not marked unsafe, replication fails

Statement-based replication of UPDATE statements that modify (parts of) unique
keys is not safe. If the order of row updates differ between master and slave,
the slave can end up modifying a different set of rows before a unique key
violation error than what was updated on the master. If the table is
non-transactional, then the statement will be partially executed on the master
and on the slave, but with different result, causing replication to diverge.

This patch marks as unsafe for statement-based replication any UPDATE that
modifies a unique key on a non-transactinal table. In mixed-mode replication,
this will cause the query to be logged in row mode. In statement-mode
replication, this will produce a warning about binlogging of unsafe statement."

> === modified file 'sql/share/errmsg-utf8.txt'
> --- a/sql/share/errmsg-utf8.txt	2014-11-13 16:13:25 +0000
> +++ b/sql/share/errmsg-utf8.txt	2014-11-16 23:49:00 +0000
> @@ -7111,3 +7111,5 @@ ER_SLAVE_SKIP_NOT_IN_GTID
>  	eng "When using GTID, @@sql_slave_skip_counter can not be used. Instead, setting @@gtid_slave_pos explicitly can be used to skip to after a given GTID position."
>  ER_TABLE_DEFINITION_TOO_BIG
>          eng "The definition for table %`s is too big"
> +ER_BINLOG_UNSAFE_UPDATE_UNIQUE_NONTRANS
> +        eng "The statement is unsafe because it updates a UNIQUE key colums in a non-transactional table. This is unsafe because in the case of a conflict a non-predictable part of the table will stay updated."

(Adding a new error code in a GA release requires some care, especially if
fix-version:5.5 is intended, but I'm sure you're better aware of this than
most).


> === modified file 'sql/sql_lex.h'
> --- a/sql/sql_lex.h	2014-08-07 16:06:56 +0000
> +++ b/sql/sql_lex.h	2014-11-16 23:49:00 +0000

> @@ -1319,6 +1319,12 @@ class Query_tables_list
>      */
>      BINLOG_STMT_UNSAFE_AUTOINC_NOT_FIRST,
>  
> +    /**
> +       INSERT into auto-inc field which is not the first part of composed
> +       primary key.
> +    */
> +    BINLOG_STMT_UPDATE_UNIQUE_NONTRANS,
> +

Copy-paste error. Here, you copied the comment of the earlier entry, you
surely intended to re-write the comment to describe the newly added entry.

Hope this helps,

 - Kristian.


Follow ups