maria-developers team mailing list archive
Re: MDEV-9423: FTWRL and Binlog checkpoint
On Thu, Jun 23, 2016 at 3:39 AM, Kristian Nielsen <knielsen@xxxxxxxxxxxxxxx> wrote:
> Nirbhay Choubey <nirbhay@xxxxxxxxxxx> writes:
> > While copying the last 2 binlog files would have solved this, I have
> > out
> > a solution where the donor node waits for binlog checkpoint event for
> > binlog
> > file to get logged before proceeding with file transfer.
> > http://lists.askmonty.org/pipermail/commits/2016-June/009483.html
> Urgh, please don't do this, seems there are multiple problems with this
> patch (insufficient locking, introducing a new redundant wait mechanism,
> comparing binlog file names rather than ids, ...).
> > By the way, I initially tried reusing
> > is_xidlist_idle_nolock()/COND_xid_list to implement the
> > waiting mechanism. But since binlog checkpoint events are written
> > asynchronously after
> > xid_count falls to 0, that did not work. So later came up with the above
> I think it should work if you follow the chained locking of LOCK_xid_list
> and LOCK_log. First wait under LOCK_xid_list for the binlog_xid_count_list
> to become empty. Then release LOCK_xid_list, and take and immediately
> release LOCK_log. mark_xid_done() will hold onto LOCK_log until the checkpoint
> has been written.
> Note that there is already a similar wait mechanism, used by RESET
> MASTER. RESET MASTER also needs to wait for checkpoint events to be
> completed before running, so we should reuse that mechanism.
> Also, it seems reasonable that FTWRL in general could wait for checkpoint
> events so that other backup mechanisms similarly could avoid binlog files
> changing during backup. So please fix this in FTWRL, in 10.2. (If you feel
> you need to fix the galera bug in 10.1, you can implement it only for
> in 10.1).
That sounds good to me. But, considering Percona's backup locks, it seems
more logical to implement this in Backup locks instead, whenever they get
ported/implemented in MariaDB.
Also, in this particular case, the problem lies
(executed after FTWRL while preparing for SST) that rotates the binary log.
So, FTWRL is not directly linked to this issue. And as you rightly pointed
out, I will refrain from altering FTWRL's behavior in 10.1 at least.
> So in more detail, here is a suggested way to fix it:
> In FTWRL (somewhere near the end, after commits are blocked), wait for
> checkpoint events to be written using a similar mechanism as RESET MASTER:
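> if (mysql_bin_log.is_open()) {
>   mysql_mutex_lock(&LOCK_xid_list);
>   while (!binlog_xid_count_list.is_last(binlog_xid_count_list.head()))
>     mysql_cond_wait(&COND_xid_list, &LOCK_xid_list);
>   mysql_mutex_unlock(&LOCK_xid_list);
>   mysql_mutex_lock(&LOCK_log);
>   mysql_mutex_unlock(&LOCK_log);
> }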
> LOCK_xid_list and LOCK_log are chained, so the LOCK_log will only be
> obtained after mark_xid_done() has written the last checkpoint event.
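The chained wait described above can be sketched outside the server with standard C++ primitives. This is only a model under stated assumptions: BinlogModel, finish_old_binlogs(), wait_for_checkpoint() and waiter_sees_checkpoint() are hypothetical names, std::mutex and std::condition_variable stand in for the server's mysql_mutex/mysql_cond wrappers, and the lock ordering inside finish_old_binlogs() is a simplification of what this thread says about mark_xid_done(), not a copy of the server code.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <deque>
#include <mutex>
#include <thread>

// Hypothetical model of the binlog state; none of this is MariaDB code.
struct BinlogModel {
  std::mutex lock_xid_list;                 // stands in for LOCK_xid_list
  std::mutex lock_log;                      // stands in for LOCK_log
  std::condition_variable cond_xid_list;    // stands in for COND_xid_list
  std::deque<int> xid_count_list{1, 2, 3};  // last entry = current binlog
  bool checkpoint_written = false;          // protected by lock_log

  // Simplified mark_xid_done(): holds lock_log from before the list
  // shrinks until the (pretend) checkpoint event has been written.
  void finish_old_binlogs() {
    std::lock_guard<std::mutex> log_guard(lock_log);
    {
      std::lock_guard<std::mutex> list_guard(lock_xid_list);
      while (xid_count_list.size() > 1) xid_count_list.pop_front();
    }
    cond_xid_list.notify_all();
    std::this_thread::sleep_for(std::chrono::milliseconds(20));
    checkpoint_written = true;
  }

  // The proposed wait: wait under lock_xid_list for the list to drain,
  // release it, then take and immediately release lock_log.
  void wait_for_checkpoint() {
    {
      std::unique_lock<std::mutex> list_guard(lock_xid_list);
      cond_xid_list.wait(list_guard,
                         [this] { return xid_count_list.size() <= 1; });
    }
    std::lock_guard<std::mutex> log_guard(lock_log);
  }
};

// Returns true if the waiter only proceeds after the checkpoint is written.
bool waiter_sees_checkpoint() {
  BinlogModel m;
  std::thread writer([&m] { m.finish_old_binlogs(); });
  m.wait_for_checkpoint();
  bool seen = m.checkpoint_written;
  writer.join();
  return seen;
}
```

In this model the waiter cannot get past the condition wait until the writer already holds lock_log, so the final lock/unlock of lock_log is what forces it to wait for the checkpoint write.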
> Now, since FTWRL is a bit different from RESET MASTER, we need a couple
> other changes:
> - Use mysql_cond_broadcast(&COND_xid_list) instead of mysql_cond_signal()
> in mark_xid_done() (to allow multiple waiters).
> - The second (but not the first) mysql_cond_broadcast() in mark_xid_done()
> should be unconditional, so remove the if() here:
> if (unlikely(reset_master_pending))
> - Also add mysql_cond_broadcast(&COND_xid_list) in two other places that
> the binlog_xid_count_list is modified. One in MYSQL_BIN_LOG::open():
> while ((b= binlog_xid_count_list.head()) && b->xid_count == 0)
> And one in reset_logs():
> This should make FTWRL wait for all pending binlog checkpoint events to be
> written. And with commits blocked, no new checkpoints should become pending.
> Does it seem reasonable to you? Let me know if some things are unclear or
> you see any potential problems with it.
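The point about broadcast versus signal with multiple waiters can be illustrated with a small standalone sketch. It is not server code: wake_all_waiters() is a hypothetical helper, and std::condition_variable::notify_all() plays the role of mysql_cond_broadcast() here. Each waiter thread stands for a session (say, one running RESET MASTER and one running FTWRL) waiting on the same condition.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical demo: returns how many waiters observed the list as idle.
int wake_all_waiters(int n_waiters) {
  std::mutex lock_xid_list;               // stands in for LOCK_xid_list
  std::condition_variable cond_xid_list;  // stands in for COND_xid_list
  bool list_idle = false;                 // protected by lock_xid_list
  int woken = 0;                          // protected by lock_xid_list

  std::vector<std::thread> waiters;
  for (int i = 0; i < n_waiters; ++i) {
    waiters.emplace_back([&] {
      std::unique_lock<std::mutex> guard(lock_xid_list);
      cond_xid_list.wait(guard, [&] { return list_idle; });
      ++woken;
    });
  }

  {
    std::lock_guard<std::mutex> guard(lock_xid_list);
    list_idle = true;
  }
  // With notify_one() only a single blocked waiter is guaranteed to wake;
  // any others already blocked could sleep forever.
  cond_xid_list.notify_all();

  for (auto &t : waiters) t.join();
  return woken;
}
```

Calling wake_all_waiters(3) returns 3 because every waiter either sees list_idle already set or is woken by the broadcast.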
Yes, it worked. But, to solve this issue in 10.1, I have added this wait to
(as explained above) only when the server is acting as a Galera node.
> By the way, how do you intend to handle the case where RESET MASTER is run
> during SST? I just checked, FTWRL does not seem to block RESET MASTER. Or do
> you have another mechanism to prevent RESET MASTER from running during SST?
> Thinking more, you should be holding LOCK_log while copying the binlog
> files (I'm guessing you're not currently, right?)
You are right.
> This will block RESET
I am now taking LOCK_log for the duration of the file transfer as protection
against the above commands.
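As a rough illustration of why holding the log lock across the whole transfer protects the file set, here is a minimal standalone sketch. Everything in it is an assumption for demonstration: BinlogFiles, try_rotate() and rotation_blocked_during_copy() are hypothetical, lock_log is a plain std::mutex standing in for LOCK_log, and the rotation uses a non-blocking try-lock only so the demo can observe the refusal without hanging.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

// Hypothetical model; lock_log stands in for the server's LOCK_log.
struct BinlogFiles {
  std::mutex lock_log;
  int file_count = 2;  // protected by lock_log

  // Non-blocking probe used only for this demo: a real command needing
  // the lock would block on it instead of giving up.
  bool try_rotate() {
    std::unique_lock<std::mutex> guard(lock_log, std::try_to_lock);
    if (!guard.owns_lock()) return false;
    ++file_count;
    return true;
  }
};

// Returns true if a rotation attempted during the "transfer" was refused
// and the number of files stayed unchanged for the whole copy.
bool rotation_blocked_during_copy() {
  BinlogFiles b;
  bool rotated = true;
  int count_before = 0;
  int count_after = 0;
  {
    // Held for the entire duration of the pretend file transfer.
    std::lock_guard<std::mutex> transfer_guard(b.lock_log);
    count_before = b.file_count;
    std::thread other([&] { rotated = b.try_rotate(); });
    other.join();  // the other session ran entirely while the lock was held
    count_after = b.file_count;
  }
  return !rotated && count_before == count_after;
}
```

The trade-off is that anything else needing the lock stalls until the transfer finishes, so the lock hold time grows with the size of the files being copied.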
> and it also makes the extra lock/unlock of LOCK_log above redundant.
Not quite. The wait logic (that includes LOCK_log, as the snippet above) is
REFRESH_BINARY_LOG and an additional use of LOCK_log to block the RESET/
FLUSH commands while file transfer is in progress.
> Also, FTWRL has really complex semantics. You should get Monty's opinion (or
> maybe Serg?) on whether there is any potential for deadlocks from waiting
> inside FTWRL for binlog checkpoints.
As explained above, FTWRL remains unchanged, but will still check if
can take a look at the fix.
> - Kristian.