maria-developers team mailing list archive
Re: MDEV-9423: FTWRL and Binlog checkpoint
Nirbhay Choubey <nirbhay@xxxxxxxxxxx> writes:
[Cc: maria-developers@, please always keep these discussions on the mailing list]
> In Galera cluster, the state transfer scripts perform FTWRL and
> copy data along with the last of all available binlog files to the
> joiner node.
> After MDEV-181, I understand that the binlog checkpoint can be
> in any of the binary log files (and not necessarily the last one).
> This seemingly has caused MDEV-9423, in which the joiner node
> complains of the missing binlog file.
> Now the question is: is FTWRL not sufficient to ensure that the
> checkpoint is always in the last binlog file?
So if I understand correctly, the issue is related to having binlog files
available during XA crash recovery. When the binlog file is rotated, there
is a small window where both the latest and the previous binlog files are
needed for crash recovery. The binlog checkpoint is the earliest binlog file
that is needed for crash recovery, and it can be seen from the binlog
So the problem here is that a copy is made just after binlog rotation, and
Galera only copies the most recent, mostly-empty binlog file, leaving
insufficient information for XA recovery, right?
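
To make the "earliest file needed" idea concrete, here is a minimal sketch that selects every binlog file from the checkpoint file through the newest one. The function name and the file names are hypothetical, and it assumes the caller already knows the checkpoint file name and has the binlog file names in creation order:

```python
def files_needed_for_recovery(binlog_names, checkpoint_name):
    """Return the binlog files from the checkpoint file through the newest.

    binlog_names: file names in creation order, oldest first (assumed input).
    checkpoint_name: earliest file still needed for crash recovery.
    Raises ValueError if the checkpoint file is not in the list.
    """
    start = binlog_names.index(checkpoint_name)
    return binlog_names[start:]


if __name__ == "__main__":
    names = ["bin.000001", "bin.000002", "bin.000003"]
    # Just after a rotation the checkpoint may still be the previous file.
    print(files_needed_for_recovery(names, "bin.000002"))
```

If the checkpoint has already reached the newest file, this returns only that one file, which is the case where copying just the last binlog is sufficient.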
One option to solve this is to always copy the last two binlog files. While
it is theoretically possible to have the binlog checkpoint more than two
files back, I think it will not occur in practice.
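
A rough sketch of the first option, assuming the state transfer script can read the binlog index file (one binlog file name per line, oldest first). The function name and the sample file names are hypothetical and not taken from the Galera scripts:

```python
def last_binlog_files(index_text, count=2):
    """Return the last `count` binlog file names listed in the index text."""
    if count <= 0:
        raise ValueError("count must be positive")
    # Skip blank lines; the index is assumed to list files oldest first.
    names = [line.strip() for line in index_text.splitlines() if line.strip()]
    return names[-count:]


if __name__ == "__main__":
    index_text = "./mysql-bin.000001\n./mysql-bin.000002\n./mysql-bin.000003\n"
    print(last_binlog_files(index_text))
```

When fewer than `count` files exist, all of them are returned, so a freshly started server with a single binlog file is handled without a special case.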
Another option is to wait for the binlog checkpoint to reach the current
binlog file. You can see this done in the test suite:
The binlog checkpointing happens asynchronously. I *think* it can complete
even while FTWRL is active, but I am not 100% sure.
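
A sketch of the waiting approach in Python rather than in the test suite's own language. It assumes the caller supplies a hypothetical `fetch_events` callable that returns (event type, info) pairs for the current binlog file, for example taken from SHOW BINLOG EVENTS, and that a checkpoint event shows up with type "Binlog_checkpoint" and the checkpoint file name as its info. Whether the checkpoint can actually advance while FTWRL is held is the open question above, so the timeout matters:

```python
import time


def checkpoint_file(events):
    """Return the file named by the last Binlog_checkpoint event, or None."""
    latest = None
    for event_type, info in events:
        if event_type == "Binlog_checkpoint":
            latest = info
    return latest


def wait_for_checkpoint(fetch_events, current_file, timeout=30.0, interval=0.1):
    """Poll until the checkpoint reaches current_file; False on timeout.

    fetch_events is a hypothetical callable supplied by the caller.
    """
    deadline = time.monotonic() + timeout
    while True:
        if checkpoint_file(fetch_events()) == current_file:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

On a timeout the caller could fall back to copying the older binlog files as well, rather than failing the state transfer.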
The checkpoint happens after InnoDB has made its commits durable with
fsync() or similar - only after that is it safe to discard the old binlog
data and still have correct crash recovery.