maria-developers team mailing list archive
-
maria-developers team
-
Mailing list archive
-
Message #05998
Re: Review of patch for MDEV-4820
On Mon, Aug 12, 2013 at 4:59 AM, Kristian Nielsen
<knielsen@xxxxxxxxxxxxxxx> wrote:
> I took a close look at your patch for MDEV-4820.
>
> I think there is a fundamental disconnect. In MariaDB GTID, I do not require
> or rely on monotonically increasing seqeunce numbers (monoticity is requred
> per-server-id, but not between different servers). Nor do I enforce or rely on
> absence of holes in the sequence numbers.
>
> This decision was a hard one to make and I spent considerable thought on this
> point quite early. It is true that this design reduces possibilities to detect
> some kinds of errors, like missing events and alternate futures.
>
> I can understand if this design is not optimal for what you are trying to
> do. However, implementing two different designs (eg. based on value of
> gtid_strict_mode) is not workable. I believe at least for the first version,
> the current design is what we have to work with.
I wonder what kind of production environment tolerates lost
transactions or alternate futures.
It's really sad to hear that by intentional design MariaDB doesn't fit
well into those environments that don't want to tolerate db
inconsistencies...
> The gtid_strict_mode is about helping the user by giving an error instead of
> logging certain undeirable event sequences to the binlog. It should not cause
> different non-error behaviour ever, nor should it prevent working with binlogs
> that were logged when strict mode was not enabled.
>
>> diff --git a/sql/rpl_gtid.cc b/sql/rpl_gtid.cc
>> --- a/sql/rpl_gtid.cc
>> +++ b/sql/rpl_gtid.cc
>> @@ -993,7 +993,7 @@ rpl_binlog_state::check_strict_sequence(uint32 domain_id, uint32 server_id,
>>
>> if ((elem= (element *)my_hash_search(&hash,
>> (const uchar *)(&domain_id), 0)) &&
>> - elem->last_gtid && elem->last_gtid->seq_no >= seq_no)
>> + elem->last_gtid && elem->last_gtid->seq_no + 1 != seq_no)
>
> If I understand this part correctly, this is about enforcing that there are no
> holes in sequence numbers. This goes against the GTID design.
Just out of curiosity: could tell me what legitimate sequence of
events can lead to hole in sequence numbers?
>> + else if (*first_event && event_gtid.seq_no > gtid->seq_no)
>> + {
>> + /*
>> + Earlier we decided that this binlog file has event that we need,
>> + but the first event in the file has seq_no beyond what we need.
>> + This can happen in two cases:
>> + - this file is the first binlog we have and Gtid_list event
>> + in it is empty;
>> + - there is a gap between the last event mentioned in Gtid_list
>> + and the first event in the binlog file.
>> + As the first reason is the most probable one let's issue the
>> + appropriate error message.
>> + */
>> + my_errno= ER_MASTER_FATAL_ERROR_READING_BINLOG;
>> + *error_gtid= *gtid;
>> + return "Could not find GTID state requested by slave in any "
>> + "binlog files. Probably the slave state is too old and "
>> + "required binlog files have been purged";
>> + }
>> + }
>
> This "first_event" (which I admit I do not fully understand) looks wrong - all
> such things usually need to be per-domain.
You are right, although I'd think it shouldn't be first_event
per-domain. Instead the Gtid_list event should be remembered and then
first seen event should be compared to what was in Gtid_list...
> However, I think this is the place in the code I mention above, where we get
> an event in a domain that the slave requested, but we did not have anything at
> connect time. So here I want an even stricter check, and also in non-strict
> mode: The first GTID in the domain must be _equal_ to what the slave
> requested, or we should fail.
Don't forget the special case when the GTID requested by slave is the
last event in this domain in the previous binlog file. Then you don't
look into that file and start serving directly from the next event
which won't be equal to what slave requested.
> ----
>
> Next I will take a closer look at your test cases and see what can be done to
> fix it in a way compatible with the basic design.
Well, if I understood you correctly all test cases shouldn't work by
design. Maybe only except the second case when server doesn't
replicate at all.
> I also want to re-iterate the suggestion to not delete binlogs when cloning a
> master to a new master. The most recent binlog file has real information (in
> the initial Gtid_list event) that is needed for proper operation as a master,
> and it really doesn't make sense to first purposely corrupt the server state
> and subsequently try to hack the code to fix the damage. If one really wanted
> not to copy any binlog files, then a better idea would be a way to restore the
> state, eg. a RESET MASTER TO gtid_list='xxx,yyy,...' command or something like
> that.
So, in MySQL 5.1 with Google's Group IDs binlog didn't have real
information. I guess we can reevaluate what we should do with MariaDB.
Although it sounds like whatever we do it still won't fit us without
hacks that go against MariaDB's design -- we'll still need to hack to
prohibit any transaction loss or alternate futures...
Pavel
Follow ups
References