maria-discuss team mailing list archive

Re: GTID and missing domain

 

Kristian,

Thank you! You helped me on this once before but I _think_ I've finally got it now. At this point domain 0 is removed from all four servers (both slave_pos and binlog_state) and it looks like they're able to connect around to each other as expected. Gives me much more confidence in being able to bounce them around as needed in the future.

Thanks again!

Dan

On 2/27/2021 11:03 AM, Kristian Nielsen wrote:
mariadb@xxxxxxxxxxxxxx writes:

And my primary server has:

gtid_binlog_pos  0-303-67739600,1-303-7363061243,100-303-4338582

gtid_binlog_state  0-302-67690294,0-301-67719794,0-303-67739600,1-301-7350472534,1-302-7350381758,1-303-7363061243,100-302-4242958,100-301-4332195,100-303-4338582

set global gtid_slave_pos = '1-303-7360639083,100-303-4337869';
start slave;

Got fatal error 1236 from master when reading data from binary log:
'Could not find GTID state requested by slave in any binlog files.
Probably the slave state is too old and required binlog files have
been purged.'

Even though I'm positive there are no domain 0 transactions (again,
hasn't been in service for years).

Yes.

You write that "there are no domain 0 transactions". But from the point of
view of the database, there _are_ domain 0 transactions, even though they
may be long in the past. These are seen in gtid_binlog_pos (and
gtid_binlog_state).

When your slave has the 0-domain in the gtid_slave_pos, the master knows
that the slave is missing no transactions. When you delete the 0-domain from
the slave, this is the same conceptually as saying the slave is missing
_all_ transactions in domain 0, and the master must send them all (or error
out if they have been purged, as here).

In general, when a slave connects, the master needs to send all transactions
in a domain that the slave did not apply yet - otherwise the slave will be
missing transactions and have the wrong data. This holds regardless of how
old those missing transactions might be. If a slave connects two years after
last being active, the system should still give a reasonable error, not
silently let the slave continue with incorrect data.

That is why you get the error.
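
To make the reasoning above concrete, here is a minimal sketch in Python of the comparison involved. It is only an illustration of the idea, not the server's actual code; the function names are made up for this example, and the GTID strings are the ones quoted earlier in this thread.

```python
def domains(gtid_list):
    """Return the set of domain IDs in a comma-separated GTID list.

    Each GTID has the form domain-server_id-sequence.
    """
    return {int(g.split("-")[0]) for g in gtid_list.split(",") if g.strip()}


def domains_missing_on_slave(master_binlog_state, slave_pos):
    """Hypothetical helper: domains the master knows about but the slave
    position does not mention. For each of these, the slave is treated as
    having applied nothing, so the master must send the whole domain."""
    return domains(master_binlog_state) - domains(slave_pos)


# Values copied from the messages above.
master_state = (
    "0-302-67690294,0-301-67719794,0-303-67739600,"
    "1-301-7350472534,1-302-7350381758,1-303-7363061243,"
    "100-302-4242958,100-301-4332195,100-303-4338582"
)
slave_pos = "1-303-7360639083,100-303-4337869"

print(sorted(domains_missing_on_slave(master_state, slave_pos)))  # prints [0]
```

Domain 0 is the only one the slave position leaves out, and the binlogs holding its transactions are long gone, hence error 1236.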

If I:

FLUSH BINARY LOGS DELETE_DOMAIN_ID=(0)

on the master, would I then be able to connect to it via

set global gtid_slave_pos = '1-303-7360639083,100-303-4337869';

Yes.

With this command, we are re-defining the history of the master to say that
there were never any transactions in domain 0. Therefore, any slave that
connects cannot be missing any such transactions.
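
As a rough model of what "re-defining the history" means, the sketch below (plain Python, not anything the server runs) drops one domain from a binlog state string and shows that the slave position from this thread then covers every remaining domain. The helper names are hypothetical, chosen only for this example.

```python
def domain_of(gtid):
    """Domain ID of a single domain-server_id-sequence GTID."""
    return int(gtid.split("-")[0])


def delete_domain(binlog_state, domain_id):
    """Hypothetical model of DELETE_DOMAIN_ID: forget every GTID of the
    given domain in a comma-separated binlog state string."""
    kept = [g for g in binlog_state.split(",") if domain_of(g) != domain_id]
    return ",".join(kept)


def uncovered_domains(binlog_state, slave_pos):
    """Domains present in the master state but absent from the slave position."""
    master = {domain_of(g) for g in binlog_state.split(",") if g}
    slave = {domain_of(g) for g in slave_pos.split(",") if g}
    return master - slave


# Values copied from the messages above.
master_state = (
    "0-302-67690294,0-301-67719794,0-303-67739600,"
    "1-301-7350472534,1-302-7350381758,1-303-7363061243,"
    "100-302-4242958,100-301-4332195,100-303-4338582"
)
slave_pos = "1-303-7360639083,100-303-4337869"

new_state = delete_domain(master_state, 0)
print(sorted(uncovered_domains(master_state, slave_pos)))  # prints [0]
print(sorted(uncovered_domains(new_state, slave_pos)))     # prints []
```

Once domain 0 is gone from the master's state, there is nothing left in that domain for a connecting slave to be missing.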

Hope this helps,

  - Kristian.

