
maria-developers team mailing list archive

Re: Fwd: possible bug in MySQL 5.6.7-rc slave replication crash safety?

 

Zardosht Kasheff <zardosht@xxxxxxxxx> writes:

> Thank you for the reply. I want to make sure I understand this
> correctly. Is the planned design for crash safe replication to be that
> the slave must have the binary log enabled, so that slaves may use the
> GTID and XA to ensure that the slave and master are in sync? If the
> binary log is not enabled, then the slaves are not crash safe. Is my
> understanding correct?

I know of three different features that touch on "crash safe slave":

1. MDEV-26. If/when this is implemented, the plan is that the state is made
crash-safe using the binary log, as you said.

2. The MySQL feature, which stores in a transactional table the info that is
now in relay-log.info.

3. The XtraDB hack, where InnoDB overwrites the relay-log.info at startup from
internal data salvaged from the InnoDB redo log.

Some comments:

 - None of these make replication slaves truly "crash safe". DDL for example
   is still not crash safe in any MySQL/MariaDB version.

 - (1) and (2) work for any storage engine. (3) is an InnoDB-only hack.

 - As far as I understand, (1) is also what is used in MySQL 5.6 global
   transaction ID. However, they may have modified the design since I last
   looked.

 - Option (2) makes it harder to implement parallel replication. The
   problem is that two transactions running in parallel could get row lock
   conflicts on the transactional table. One way to solve this is to insert
   a new row with every commit (and use SELECT MAX(...) to get the state).
   Then we need a garbage-collection thread to periodically remove old rows.

 - Option (1) is the nicest from a design point of view, as it reuses the
   existing mechanism for recovering consistently after a crash in an
   engine-neutral way. The disadvantage is that it is quite hard (read: I
   doubt it will ever happen) to implement it without at least one fsync()
   per (group) commit, because it requires the binlog to be enabled on the
   slave. In contrast, InnoDB by itself is crash-safe (it can lose
   transactions but will not become inconsistent) with
   innodb_flush_log_at_trx_commit=0|2.
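
The row-per-commit workaround mentioned under option (2) can be sketched
roughly as follows. This is only a minimal illustration, using Python's
sqlite3 module as a stand-in for a transactional engine. The table name
slave_pos, the column log_pos, and all helper functions are hypothetical and
do not correspond to any actual MySQL or MariaDB schema or API.

```python
import sqlite3


def open_state(path=":memory:"):
    """Open a database holding both the replicated data and the position table.

    Both table names are hypothetical, chosen for this sketch only.
    """
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS slave_pos (log_pos INTEGER PRIMARY KEY)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS data (k TEXT PRIMARY KEY, v TEXT)"
    )
    conn.commit()
    return conn


def apply_event(conn, log_pos, key, value):
    """Apply one replicated change and record its position atomically.

    A new row is inserted instead of updating a single shared row, so
    concurrent appliers would not contend for the same row lock.
    """
    with conn:  # commits on success, rolls back on exception
        conn.execute("INSERT OR REPLACE INTO data VALUES (?, ?)", (key, value))
        conn.execute("INSERT INTO slave_pos (log_pos) VALUES (?)", (log_pos,))


def current_position(conn):
    """Return the highest recorded position, or None if nothing was applied."""
    return conn.execute("SELECT MAX(log_pos) FROM slave_pos").fetchone()[0]


def garbage_collect(conn):
    """Remove every position row except the newest one."""
    with conn:
        conn.execute(
            "DELETE FROM slave_pos "
            "WHERE log_pos < (SELECT MAX(log_pos) FROM slave_pos)"
        )


if __name__ == "__main__":
    conn = open_state()
    apply_event(conn, 100, "a", "1")
    apply_event(conn, 250, "b", "2")
    apply_event(conn, 180, "c", "3")  # commits may arrive out of order
    print(current_position(conn))
    garbage_collect(conn)
```

Because the position row is written in the same transaction as the data
change, a crash either keeps both or loses both. The sketch does not address
how a real parallel applier would handle gaps below the maximum position.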

Hope this helps,

 - Kristian.

