← Back to team overview

maria-developers team mailing list archive

Re: MDEV-225: Replace with dummy events an event that is not understood by a slave to which it should be sent


Hi Serg, thanks for review!

Sergei Golubchik <serg@xxxxxxxxxxxx> writes:

> On Apr 27, knielsen@xxxxxxxxxxxxxxx wrote:

>>   Add that slaves set a variable @mariadb_slave_capability to inform
>>   the master in a robust way about which events it can, and cannot,
>>   handle.

> I'm not sure I like the idea of magic user variables. Why wouldn't you
> create a session server variable @@mariadb_slave_capability?

I'll admit I did not think much about it, I just did it the same way that
other replication features are announced by slaves to the master,
eg. @master_binlog_checksum or @master_heartbeat_period.

There is some merit in being consistent with current code, but there is also
merit in avoiding magic user variables. If you prefer a session server
variable @@mariadb_slave_capability, I will change it?

> Also: Why would "a later release of a slave change what an earlier
> release of a master believes about its capabilities"?

Agree this is unclear. The basic idea is just to argue that it is better that
the slave announces its capabilities explicitly rather than the master tries
to guess based on version.

What I meant here was: if the master used the slave version to guess what the
slave is and is not capable of, then that guess will be fixed. So whenever we
release a new version of the slave, it will not be possible to influence what
the old master guesses (unless we change the slave version number, which is
not always possible/desirable). For example, we could not backport a slave
feature to an older version number and work with old master.

I re-worded the description in the Jira task to this to clarify:

"The slave will, when connecting, set a user variable
{{@mariadb_slave_capability}}. The master will read the value of this variable
and use it to determine the capabilities of the slave. This is more robust and
flexible than relying on version numbers, since ideally any version master
should be able to replicate against any version slave. When we release the
code in a master, it is hard to predict what slave capabilities every possible
version of future releases will have - especially with multiple forks/branches
of MySQL being developed individually. And announcing the slave capabilities
explicitly allows the slave code full flexibility in working correctly against
older masters."

>> +--echo # Test slave with no capability gets dummy event, which is ignored.
>> +--source include/stop_slave.inc
>> +SET @@global.debug_dbug='d,simulate_slave_capability_none';
> Don't do that please, it'll stop any debug trace for ./mtr --debug.
> Instead, write
> SET @@global.debug_dbug='+d,simulate_slave_capability_old_53';
>  ... your tests
> SET @@global.debug_dbug='-d,simulate_slave_capability_old_53';
> or
> SET @debug_old=@@global.debug_dbug;
> SET @@global.debug_dbug='+d,simulate_slave_capability_old_53';
>  ... your tests
> SET @@global.debug_dbug=@debug_old;

Yes, this is much better, thanks for the tip!

>> +  if (data_len < min_query_event_len)
>> +  {
>> +    /*
>> +      Have to use dummy user_var event for such a short packet.
>> +
>> +      This works, but the event will be considered part of an event group with
>> +      the following event, so things like START SLAVE UNTIL may behave slightly
>> +      unexpected.
> Why?

I'll reword to:

"This works, but the event will be considered part of an event group with
the following event. So for example @@global.sql_slave_skip_counter=1
will skip not only the dummy event, but also the immediately following

I am not too happy about this behaviour, but I did not find a better way. On
the other hand - this is unlikely to occur often in practise (if ever),
because most events will be large enough to be replaced by a dummy query event
instead. And besides, this is only for replicating new slave against old
master, which is not "officially" supported anyway. And in any case, it is
surely better than sending an event to the slave that slave does not
understand at all and causes replication to break.

>> +  /* Announce MariaDB slave capabilities. */
>> +  DBUG_EXECUTE_IF("simulate_slave_capability_none", goto after_set_cabability;);
>> +  {
>> +    int rc= DBUG_EVALUATE_IF("simulate_slave_capability_old_53",
>> +        mysql_real_query(mysql, STRING_WITH_LEN("SET @mariadb_slave_capability="
>> +                                STRINGIFY_ARG(MARIA_SLAVE_CAPABILITY_ANNOTATE))),
>> +        mysql_real_query(mysql, STRING_WITH_LEN("SET @mariadb_slave_capability="
>> +                                STRINGIFY_ARG(MARIA_SLAVE_CAPABILITY_MINE))));
> Better put DBUG_EVALUATE_IF only around the STRINGIFY_ARG values:
> so that next time I (or somebody) wouldn't need to vdiff two long
> function calls, trying to count parentheses and commans :)

That is not possible, I'm afraid. DBUG_EVALUATE_IF() expands to an
expression (XXX ? YYY : ZZZ) - but preprocessor needs a constant to
concatenate two strings "foo" "bar".

And STRING_WITH_LEN(...) expands to two expressions with a comma between them
UUU, VVV - so I also cannot put DBUG_EVALUATE_IF() around STRING_WITH_LEN().

I rewrote it to use strlen() instead of STRING_WITH_LEN - this code is not
performance critical:

    const char *q=
                       "SET @mariadb_slave_capability="
                       "SET @mariadb_slave_capability="
    if (mysql_real_query(mysql, q, strlen(q)))

>> === modified file 'sql/sql_repl.cc'
>> --- a/sql/sql_repl.cc	2012-03-28 17:26:00 +0000
>> +++ b/sql/sql_repl.cc	2012-04-27 08:20:38 +0000
>> @@ -563,14 +584,44 @@ static int send_heartbeat_event(NET* net
>>  static const char *
>>  send_event_to_slave(THD *thd, NET *net, String* const packet, ushort flags,
>>                      Log_event_type event_type, char *log_file_name,
>> -                    IO_CACHE *log)
>> +                    IO_CACHE *log, int mariadb_slave_capability,
>> +                    ulong ev_offset, uint8 current_checksum_alg)
> current_checksum_alg is part of THD, why do you bother to pass is separately?

No, this is a different value - this is the checksum algorithm used for the
binlog file currently being read, which is independent of what is in THD.

>> +    else
>> +    {
>> +      /*
>> +        The slave does not understand ANNOTATE_ROWS_EVENT.
>> +
>> +        Older MariaDB slaves (and MySQL slaves) will break replication if there
>> +        are holes in the binlog stream (they will miscompute the binlog offset
>> +        and request the wrong position when reconnecting).
>> +
>> +        So replace the event with a dummy event of the same size that will be
>> +        a no-operation on the slave.
>> +      */
>> +      if (Query_log_event::dummy_event(packet, ev_offset, current_checksum_alg))
>> +        return "Failed to replace row annotate event with dummy: too small event.";
> No ER_xxx error message for that?

Again, I have to admit I did not consider it, I just followed what
the existing replication code was already doing (which is to set

But now that I do consider it - I do not think it is worth it to add a new
error code for this. I believe this is an impossible condition - the limit is
6 bytes, and I do not think it is possible for a statement of less than 6
bytes to cause a row-based binlog event to be logged (the shortest I can think
of is 6 bytes - "CALL f").

I can add a DBUG_ASSERT(), if you like?

 - Kristian.

Follow ups