maria-developers team mailing list archive

Thread
Date
Review of patch for MDEV-4820

To: Pavel Ivanov <pivanof@xxxxxxxxxx>, maria-developers@xxxxxxxxxxxxxxxxxxx
From: Kristian Nielsen <knielsen@xxxxxxxxxxxxxxx>
Date: Mon, 12 Aug 2013 13:59:04 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/23.4 (gnu/linux)
I took a close look at your patch for MDEV-4820.

I think there is a fundamental disconnect. In MariaDB GTID, I do not require
or rely on monotonically increasing seqeunce numbers (monoticity is requred
per-server-id, but not between different servers). Nor do I enforce or rely on
absence of holes in the sequence numbers.

This decision was a hard one to make and I spent considerable thought on this
point quite early. It is true that this design reduces possibilities to detect
some kinds of errors, like missing events and alternate futures.

I can understand if this design is not optimal for what you are trying to
do. However, implementing two different designs (eg. based on value of
gtid_strict_mode) is not workable. I believe at least for the first version,
the current design is what we have to work with.

The gtid_strict_mode is about helping the user by giving an error instead of
logging certain undeirable event sequences to the binlog. It should not cause
different non-error behaviour ever, nor should it prevent working with binlogs
that were logged when strict mode was not enabled.

> diff --git a/sql/rpl_gtid.cc b/sql/rpl_gtid.cc
> --- a/sql/rpl_gtid.cc
> +++ b/sql/rpl_gtid.cc
> @@ -993,7 +993,7 @@ rpl_binlog_state::check_strict_sequence(uint32 domain_id, uint32 server_id,
>  
>    if ((elem= (element *)my_hash_search(&hash,
>                                         (const uchar *)(&domain_id), 0)) &&
> -      elem->last_gtid && elem->last_gtid->seq_no >= seq_no)
> +      elem->last_gtid && elem->last_gtid->seq_no + 1 != seq_no)

If I understand this part correctly, this is about enforcing that there are no
holes in sequence numbers. This goes against the GTID design.

> @@ -854,8 +857,16 @@ contains_all_slave_gtid(slave_connection_state *st, Gtid_list_log_event *glev)
>        */
>        return false;
>      }
> -    if (gtid->server_id == glev->list[i].server_id &&
> -        gtid->seq_no <= glev->list[i].seq_no)
> +    if (slave_gtid_strict_mode)
> +    {
> +      if (gtid->seq_no < glev->list[i].seq_no)
> +      {
> +        has_greater_seq_no= true;
> +        break;
> +      }
> +    }

I think here you want to depend on sequence numbers being monotonic. This does
not work, as you cannot be sure that strict mode was enabled when this binlog
file was written.

> -    if (!rpl_global_gtid_slave_state.domain_to_gtid(slave_gtid->domain_id,
> -                                                    &master_replication_gtid) ||
> +    bool master_repl_gtid_found =
> +        rpl_global_gtid_slave_state.domain_to_gtid(slave_gtid->domain_id,
> +                                                   &master_replication_gtid);
> +    if (!master_repl_gtid_found ||
>          slave_gtid->server_id != master_replication_gtid.server_id ||
>          slave_gtid->seq_no != master_replication_gtid.seq_no)
>      {

Right, so slave requests something in a domain that we do not have anything
about in the binlog, but we do have something in the slave state (but not the
correct one).

So this is a bug in the current code. If slave requests to start at a position
in a domain we have nothing in, either that domain is served by a different
master, or slave wants to start from the very first event in that
domain. Currently, we fail to check in the second case that we actually have
the correct GTID as the first event.

But this is the wrong way to fix it. A master should not be prevented from
serving some domain D1 just because it has some old junk for another domain D2
in the slave state.

Instead, when we allow to start in a domain that is missing, and then later
actually encounter an event in this domain, we should check that it matches
what the slave originally requested. If not, we should give an error. I'll try
to find time to do this.

> @@ -1191,6 +1239,16 @@ gtid_find_binlog_file(slave_connection_state *state, char *out_name,
>              */
>              state->remove(gtid);
>            }
> +          else if (slave_gtid_strict_mode &&
> +                   gtid->seq_no == glev->list[i].seq_no)
> +          {
> +            *errormsg= "Requested slave GTID state have different server id "
> +                       "than the one in binlog";
> +            *error_gtid= *gtid;
> +            *good_gtid= glev->list[i];
> +            error= ER_SLAVE_IS_FROM_ALTERNATE_FUTURE;
> +            goto end;
> +          }

This can fail if the binlog file was written when strict mode was not enabled.

>  
> @@ -1559,7 +1620,39 @@ send_event_to_slave(THD *thd, NET *net, String* const packet, ushort flags,
>                event_gtid.seq_no <= gtid->seq_no)
>              *gtid_skip_group = (flags2 & Gtid_log_event::FL_STANDALONE ?
>                                  GTID_SKIP_STANDALONE : GTID_SKIP_TRANSACTION);
> -          if (event_gtid.server_id == gtid->server_id &&
> +          if (slave_gtid_strict_mode)
> +          {
> +            if (event_gtid.seq_no == gtid->seq_no &&
> +                event_gtid.server_id != gtid->server_id)
> +            {
> +              my_errno= ER_SLAVE_IS_FROM_ALTERNATE_FUTURE;
> +              *error_gtid= *gtid;
> +              *good_gtid= event_gtid;
> +              return "Requested slave GTID state have different server id "
> +                     "than the one in binlog";
> +            }

Again, what if this is an old part of the binlog where strict mode was disabled?


> +            else if (*first_event && event_gtid.seq_no > gtid->seq_no)
> +            {
> +              /*
> +                Earlier we decided that this binlog file has event that we need,
> +                but the first event in the file has seq_no beyond what we need.
> +                This can happen in two cases:
> +                 - this file is the first binlog we have and Gtid_list event
> +                   in it is empty;
> +                 - there is a gap between the last event mentioned in Gtid_list
> +                   and the first event in the binlog file.
> +                As the first reason is the most probable one let's issue the
> +                appropriate error message.
> +               */
> +              my_errno= ER_MASTER_FATAL_ERROR_READING_BINLOG;
> +              *error_gtid= *gtid;
> +              return "Could not find GTID state requested by slave in any "
> +                     "binlog files. Probably the slave state is too old and "
> +                     "required binlog files have been purged";
> +            }
> +          }

This "first_event" (which I admit I do not fully understand) looks wrong - all
such things usually need to be per-domain.

However, I think this is the place in the code I mention above, where we get
an event in a domain that the slave requested, but we did not have anything at
connect time. So here I want an even stricter check, and also in non-strict
mode: The first GTID in the domain must be _equal_ to what the slave
requested, or we should fail.

----

Next I will take a closer look at your test cases and see what can be done to
fix it in a way compatible with the basic design.

I also want to re-iterate the suggestion to not delete binlogs when cloning a
master to a new master. The most recent binlog file has real information (in
the initial Gtid_list event) that is needed for proper operation as a master,
and it really doesn't make sense to first purposely corrupt the server state
and subsequently try to hack the code to fix the damage. If one really wanted
not to copy any binlog files, then a better idea would be a way to restore the
state, eg. a RESET MASTER TO gtid_list='xxx,yyy,...' command or something like
that.

Hope this helps,

 - Kristian.
Follow ups

Re: Review of patch for MDEV-4820
From: Jan Lindström, 2013-08-13
Re: Review of patch for MDEV-4820
From: Pavel Ivanov, 2013-08-12