maria-developers team mailing list archive
Subject: Re: MDEV-9423: FTWRL and Binlog checkpoint
To: Nirbhay Choubey <nirbhay@xxxxxxxxxxx>
From: Kristian Nielsen <knielsen@xxxxxxxxxxxxxxx>
Date: Thu, 23 Jun 2016 09:39:31 +0200
Cc: MariaDB Developers <maria-developers@xxxxxxxxxxxxxxxxxxx>
In-reply-to: <CACAc7V=NxPSxmUE_s0Ex_7gdt0xQ2orMOnn8Ke-U3EJk2jrfsg@mail.gmail.com> (Nirbhay Choubey's message of "Wed, 22 Jun 2016 21:34:30 -0400")
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.4 (gnu/linux)
Nirbhay Choubey <nirbhay@xxxxxxxxxxx> writes:
> While copying the last 2 binlog files would have solved this, I have worked
> a solution where the donor node waits for binlog checkpoint event for last
> file to get logged before proceeding with file transfer.
Urgh, please don't do this. There seem to be multiple problems with this
patch (insufficient locking, introducing a new redundant wait mechanism,
comparing binlog file names rather than ids, ...).
> By the way, I initially tried reusing is_xidlist_idle_nolock()/COND_xid_list
> to implement the waiting mechanism. But since binlog checkpoint events are
> written asynchronously after xid_count falls to 0, that did not work. So
> later came up with the above
I think it should work if you follow the chained locking of LOCK_xid_list
and LOCK_log. First wait under LOCK_xid_list for the binlog_xid_count_list
to become empty. Then release LOCK_xid_list and take and immediately release
LOCK_log. mark_xid_done() will hold onto LOCK_log until the checkpoint event
has been written.
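To illustrate the chained locking described above, here is a minimal standalone C++ sketch. It is a simplified model, not the server code: BinlogModel, mark_xid_done_model(), wait_for_checkpoint_model() and demo_chained_wait() are hypothetical names, and std::mutex / std::condition_variable stand in for LOCK_xid_list, LOCK_log and COND_xid_list. It assumes the writer takes the log lock before it releases the list lock, which is what makes the waiter's lock/unlock of the log lock sufficient.

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <mutex>
#include <thread>

// Simplified model only. The members are hypothetical stand-ins for
// LOCK_xid_list, LOCK_log, COND_xid_list and binlog_xid_count_list.
struct BinlogModel {
  std::mutex lock_xid_list;
  std::mutex lock_log;
  std::condition_variable cond_xid_list;
  std::deque<int> xid_count_list{1};  // one entry is still pending
  bool checkpoint_written = false;    // protected by lock_log

  // Models the last pending commit finishing and the checkpoint being logged.
  void mark_xid_done_model() {
    std::unique_lock<std::mutex> xid(lock_xid_list);
    xid_count_list.clear();
    // Chained locking (assumed): take lock_log BEFORE releasing lock_xid_list.
    std::unique_lock<std::mutex> log(lock_log);
    xid.unlock();
    cond_xid_list.notify_all();
    checkpoint_written = true;  // stands in for writing the checkpoint event
  }                             // lock_log is released here

  // Models the waiter: wait for an empty list, then pass through lock_log.
  bool wait_for_checkpoint_model() {
    {
      std::unique_lock<std::mutex> xid(lock_xid_list);
      cond_xid_list.wait(xid, [this] { return xid_count_list.empty(); });
    }  // lock_xid_list is released here
    // Blocks until the writer has finished and released lock_log.
    std::lock_guard<std::mutex> log(lock_log);
    return checkpoint_written;
  }
};

// Runs one waiter and one writer; returns what the waiter observed.
bool demo_chained_wait() {
  BinlogModel m;
  bool seen = false;
  std::thread waiter([&] { seen = m.wait_for_checkpoint_model(); });
  std::thread worker([&] { m.mark_xid_done_model(); });
  waiter.join();
  worker.join();
  assert(m.xid_count_list.empty());
  return seen;
}
```

In this model the waiter can only see the list empty after the writer already holds the log lock, so by the time the waiter gets the log lock the checkpoint flag is set. Without the chaining, the waiter could slip through between the list becoming empty and the checkpoint being written.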
Note that there is already a similar wait mechanism, used by RESET
MASTER. RESET MASTER also needs to wait for checkpoint events to be
completed before running, so we should reuse that mechanism.
Also, it seems reasonable that FTWRL in general could wait for checkpoint
events so that other backup mechanisms similarly could avoid binlog files
changing during backup. So please fix this in FTWRL, in 10.2. (If you feel
you need to fix the galera bug in 10.1, you can implement it only for galera
there.)
So in more detail, here is a suggested way to fix it:
In FTWRL (somewhere near the end, after commits are blocked), wait for
checkpoint events to be written using a similar mechanism as RESET MASTER:
LOCK_xid_list and LOCK_log are chained, so the LOCK_log will only be
obtained after mark_xid_done() has written the last checkpoint event.
Now, since FTWRL is a bit different from RESET MASTER, we need a couple
of changes:
- Use mysql_cond_broadcast(&COND_xid_list) instead of mysql_cond_signal()
in mark_xid_done() (to allow multiple waiters).
- The second (but not the first) mysql_cond_broadcast() in mark_xid_done()
should be unconditional, so remove the if() here:
- Also add mysql_cond_broadcast(&COND_xid_list) in the two other places where
the binlog_xid_count_list is modified. One in MYSQL_BIN_LOG::open():
while ((b= binlog_xid_count_list.head()) && b->xid_count == 0)
And one in reset_logs():
This should make FTWRL wait for all pending binlog checkpoint events to be
written. And with commits blocked, no new checkpoints should become pending.
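The reason for broadcast rather than signal can be shown with a small standalone C++ sketch. This is an illustrative model, not the server code: XidListModel and run_waiters() are hypothetical names, and notify_all() plays the role of mysql_cond_broadcast(). With several waiters (say, RESET MASTER and FTWRL) blocked on the same condition, a single-waiter wakeup could leave the others asleep.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical model: several waiters block until a shared list is empty.
struct XidListModel {
  std::mutex lock_xid_list;
  std::condition_variable cond_xid_list;
  int pending = 1;  // stands in for entries in binlog_xid_count_list
  int woken = 0;    // number of waiters that got past the wait

  void wait_until_empty() {
    std::unique_lock<std::mutex> lk(lock_xid_list);
    cond_xid_list.wait(lk, [this] { return pending == 0; });
    ++woken;
  }

  void mark_done() {
    {
      std::lock_guard<std::mutex> lk(lock_xid_list);
      pending = 0;  // every modification of the list is followed by a wakeup
    }
    // Wake ALL waiters. With notify_one() only one blocked waiter would be
    // woken, and the remaining ones could wait forever.
    cond_xid_list.notify_all();
  }
};

// Starts n waiters plus one finisher; returns how many waiters woke up.
int run_waiters(int n) {
  assert(n > 0);
  XidListModel m;
  std::vector<std::thread> threads;
  for (int i = 0; i < n; ++i)
    threads.emplace_back([&m] { m.wait_until_empty(); });
  m.mark_done();
  for (auto &t : threads) t.join();
  return m.woken;
}
```

Calling run_waiters(3) returns once all three waiters have been released. The same reasoning is why each place that modifies the list should wake the waiters: a waiter only rechecks its condition when it is woken.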
Does it seem reasonable to you? Let me know if some things are unclear or if
you see any potential problems with it.
By the way, how do you intend to handle the case where RESET MASTER is run
during SST? I just checked, FTWRL does not seem to block RESET MASTER. Or do
you have another mechanism to prevent RESET MASTER from running during SST?
Thinking more, you should be holding LOCK_log while copying the binlog
files (I'm guessing you're not currently, right?). This will block RESET
MASTER, and it also makes the extra lock/unlock of LOCK_log above redundant.
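As a rough illustration of holding the log lock across the copy, here is a standalone C++ sketch. It is a model under stated assumptions, not the server or SST code: LogModel, copy_binlogs_locked(), try_reset_master() and reset_blocked_during_copy() are hypothetical names. The real RESET MASTER would block on the lock; the sketch uses a non-blocking attempt only so that the conflict is observable.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

struct LogModel {
  std::mutex lock_log;  // hypothetical stand-in for LOCK_log

  // Stand-in for RESET MASTER. The real statement would block on the lock;
  // try_to_lock is used here only so the sketch can observe the conflict.
  bool try_reset_master() {
    std::unique_lock<std::mutex> lk(lock_log, std::try_to_lock);
    return lk.owns_lock();
  }

  // Runs copy() with lock_log held for the whole duration of the copy.
  template <typename CopyFn>
  void copy_binlogs_locked(CopyFn copy) {
    std::lock_guard<std::mutex> lk(lock_log);
    copy();
  }
};

// Returns true if a reset attempted in the middle of the copy was kept out.
bool reset_blocked_during_copy() {
  LogModel m;
  bool reset_ran = true;
  m.copy_binlogs_locked([&] {
    // A concurrent session attempts RESET MASTER while files are copied.
    std::thread other([&] { reset_ran = m.try_reset_master(); });
    other.join();
  });
  return !reset_ran;
}
```

Because the copier already holds the log lock after the list has drained, a separate take-and-release of that lock beforehand adds nothing in this model.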
Also, FTWRL has really complex semantics. You should get Monty's opinion (or
maybe Serg's?) on whether there is any potential for deadlocks from waiting
inside FTWRL for binlog checkpoints.