maria-developers team mailing list archive

Re: New problem with GTID patch

 

Pavel Ivanov <pivanof@xxxxxxxxxx> writes:

> of 10.0.1 tarball. It works okay in 10.0-mdev26 though, so I'm not
> sure if you'll want to look at the problem now.

Yes, I would like to look. It could easily be a hidden problem in 10.0-mdev26
too, and I'll have to merge to 10.0 soon anyway.

> Unfortunately with your recent commit (revision 3546) test
> rpl.rpl_gtid_startpos doesn't work when GTID patch is applied on top

> In particular the problematic addition is
>
> if (!is_relay_log && read_state_from_file())
>     DBUG_RETURN(1);

We remember the last GTIDs written into previous binlog files, and
save/restore them across server restarts. read_state_from_file() reads this
state in the first time a binlog file is opened. If we read the wrong thing,
such as an empty file, then we get the behaviour you observed: the state is
empty, we log an empty GTID list event, and this causes
gtid_find_binlog_file() to find the wrong place to start.
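
To make the failure mode concrete, here is a toy model, not the server code.
ToyGtid, ToyBinlogState and count_after_reload() are made-up names, and the
one-GTID-per-line file format is an assumption for illustration only. It shows
that loading an empty file silently replaces a non-empty in-memory state, so a
GTID list built from it would have count == 0.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <sstream>
#include <string>

// Toy GTID (domain-server-sequence); hypothetical, not the server's struct.
struct ToyGtid { uint32_t domain_id; uint32_t server_id; uint64_t seq_no; };

// Toy stand-in for the binlog state: the last GTID seen per domain.
struct ToyBinlogState {
  std::map<uint32_t, ToyGtid> last_per_domain;

  void update(const ToyGtid &g) { last_per_domain[g.domain_id] = g; }

  // Replace the in-memory state with whatever the "file" contains.
  // An empty file produces an empty state and no error.
  void load(std::istream &in) {
    last_per_domain.clear();
    ToyGtid g;
    char dash1, dash2;
    while (in >> g.domain_id >> dash1 >> g.server_id >> dash2 >> g.seq_no)
      update(g);
  }

  // Number of entries a GTID list built from this state would carry.
  std::size_t gtid_list_count() const { return last_per_domain.size(); }
};

// Start with one known GTID in memory, then reload from the given contents.
std::size_t count_after_reload(const std::string &file_contents) {
  ToyBinlogState state;
  state.update({0, 1, 100});
  assert(state.gtid_list_count() == 1);  // correct in-memory state
  std::istringstream file(file_contents);
  state.load(file);                      // overwrites the in-memory state
  return state.gtid_list_count();
}
```

With these definitions, count_after_reload("") gives 0, while
count_after_reload("0-1-100\n") keeps one entry.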

You might want to try revision 3547. It fixes a stupid mistake where, after
crash recovery, I forgot to set the flag that prevents loading the state from
the file, so the state was immediately overwritten. But since crash recovery
is not involved here, I do not see how this would cause your failure, unless
multiple test cases are somehow interacting with each other.
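
As a rough illustration of that kind of mistake (a sketch only; ToyLog, its
state_loaded flag and both methods are hypothetical names, not the actual
patch): recovery rebuilds the state, and unless it also marks the state as
loaded, the next open replaces it with the stale file contents.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct ToyLog {
  std::vector<uint64_t> state;       // stand-in for the in-memory binlog state
  std::vector<uint64_t> file_state;  // stand-in for the state file contents
  bool state_loaded = false;         // hypothetical "do not reload" flag

  // Stand-in for crash recovery: rebuild the state from a binlog scan.
  void recover(const std::vector<uint64_t> &scanned, bool set_flag) {
    state = scanned;
    if (set_flag)
      state_loaded = true;           // the step described as forgotten
  }

  // Stand-in for opening a binlog file: load from file only the first time.
  void open() {
    if (!state_loaded) {
      state = file_state;            // overwrites whatever recovery built
      state_loaded = true;
    }
  }
};

// Run recovery followed by open, with an empty (stale) state file.
std::size_t entries_after_recovery_then_open(bool set_flag) {
  ToyLog log;
  assert(log.file_state.empty());
  log.recover({100, 101}, set_flag);
  log.open();
  return log.state.size();
}
```

In this model, entries_after_recovery_then_open(true) keeps both entries,
while entries_after_recovery_then_open(false) ends up with none.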

Can I get your 10.0-based tree somewhere to try and repeat it? (Or just the
current patch against the 10.0.1 tarball.)

> to function MYSQL_BIN_LOG::open(). Without it test works. With it test
> breaks in the section "Test that master gives error when slave asks
> for empty gtid pos and binlog files have been purged". The problem is
> when get_gtid_list_event() is called from gtid_find_binlog_file() it
> returns event with count == 0. It causes contains_all_slave_gtid() to

This strongly indicates that we are wrongly re-loading stale data from the
state file, overwriting the correct in-memory state.

> everything that should happen here that much. If you want me to look
> at something else to better understand the problem let me know.

If you can add printouts to the error log of all calls to
read_state_from_file() and of all GTIDs passed to rpl_binlog_state::update(),
that may give a hint as to where we overwrite the binlog state.
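
Something along these lines is what I have in mind, as a standalone sketch.
The trace(), format_gtid() and traced_*() helpers are hypothetical; in the
server the lines would presumably go to the error log (e.g. via
sql_print_information()) from inside the real functions rather than through
wrappers.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

// Collected trace lines; kept in memory here so the sketch can be checked.
std::vector<std::string> trace_lines;

// Print one line to stderr and remember it.
void trace(const std::string &line) {
  std::fprintf(stderr, "%s\n", line.c_str());
  trace_lines.push_back(line);
}

// Format a GTID in the domain-server-sequence text form.
std::string format_gtid(uint32_t domain_id, uint32_t server_id,
                        uint64_t seq_no) {
  return std::to_string(domain_id) + "-" + std::to_string(server_id) + "-" +
         std::to_string(seq_no);
}

// Hypothetical hook: log each call that reloads the state, with its caller.
void traced_read_state_from_file(const char *caller) {
  trace(std::string("read_state_from_file() called from ") + caller);
}

// Hypothetical hook: log each GTID that is passed to the state update.
void traced_update(uint32_t domain_id, uint32_t server_id, uint64_t seq_no) {
  trace("rpl_binlog_state::update(" +
        format_gtid(domain_id, server_id, seq_no) + ")");
}
```

A read_state_from_file() line appearing after update lines in such a trace
would point at the place where the state gets overwritten.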

 - Kristian.

