
maria-developers team mailing list archive

Re: [External] Obsolete GTID domain delete on master (MDEV-12012, MDEV-11969)

 

Kristian, thanks for more remarks!

>>> If you "forget" the domain on the upstream server, what happens if there
>>> are downstream slaves? I think you'll break replication if they disconnect
>>> from this box and try to reconnect. Their GTID information will no longer
>>> match. IMO, and if I've understood correctly, this is broken.
>
> It should not break replication. It is allowed for a slave with GTID
> position 0-1-100,10-2-200 to connect to a master that has nothing in domain
> 10, this is normal.

To me, this is in a sense an "implicit" IGNORE_DOMAIN_IDS for the domains that
the master does not have.
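
As a rough illustration of that reading, here is a toy Python model (not
server code; the function names are made up for this sketch). It only parses
GTID lists of the form domain-server-seq and reports which of the slave's
domains the master knows nothing about, i.e. the ones that would be
"implicitly ignored" on connect:

```python
def domains_of(gtid_list):
    """Return the set of domain ids in a 'domain-server-seq,...' string."""
    domains = set()
    for item in gtid_list.split(","):
        item = item.strip()
        if item:
            domain_id, _server_id, _seq_no = item.split("-")
            domains.add(int(domain_id))
    return domains


def domains_missing_on_master(slave_pos, master_state):
    """Domains present in the slave position but absent on the master."""
    return sorted(domains_of(slave_pos) - domains_of(master_state))


# Kristian's example: the slave is at 0-1-100,10-2-200 and the master
# has nothing in domain 10.
print(domains_missing_on_master("0-1-100,10-2-200", "0-1-150"))
```

With Kristian's example the only such domain is 10, and the connect is still
allowed.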

>
> I am not sure what the use-case of replicating DELETE DOMAIN to a slave
> would be. Domain deletion does not have a point-in-time property like normal
> transactions, so it does not help to have it replicated inline in the event
> stream. If it has an effect on a slave, this effect occurs only when the
> slave is restarted/reconnected.

The use-case must've been the suspected loss of connectivity by slaves.

>
>>> I really think there’s a need to indicate what domains should be
>>> forgotten/ignored
>
> If CHANGE MASTER ... IGNORE_DOMAIN_IDS is fixed to also ignore the extra
> domains on master upon connect, it is probably a better way to ignore
> domains in many cases. It is persisted (in the slave's master.info), and it
> can be set individually for each slave, which is more flexible (what if one
> slave needs to ignore a domain but another slave needs to replicate
> it?).
>
>>    >KN> The procedure to fix it will then be:
>>    >> 
>>    >> 1. FLUSH BINARY LOGS, note the new GTID position.
>>    >> 
>>    >> 2. Ensure that all slaves are past the problematic point with
>>    >> MASTER_GTID_WAIT(<pos>). After this, the old erroneous binlog files
>>    >> are no longer needed.
>>    >> 
>>    >> 3. PURGE BINARY LOGS to remove the erroneous logs.
>>    >> 
>>    >> 4. FLUSH BINARY LOGS DELETE DOMAIN d
>
> So this was what I suggested at some point related to MDEV-12012. But
> probably this is not the best suggestion, as I realised later.
>
> 1. In MDEV-12012, two independent masters were originally using the same
> domain id, so their history looks diverged in terms of GTID. This can be
> fixed by injecting a dummy transaction to make them up-to-date with one
> another in that domain. Deleting a (possibly valuable) part of the history
> is not needed.
>
> 2. Another case: a slave needs to ignore the part of the history on a master
> that belongs to some domain. IGNORE_DOMAIN_IDS, once fixed, can do this;
> again, there is no need to delete possibly valuable history on the master.
>

Right. The feature we've been discussing deals solely with point 3 below.

> 3. At some point, a domain that was unused for long may no longer appear
> anywhere, _except_ in gtid_binlog_state and gtid_slave_pos. This may
> eventually clutter the output and be an annoyance. The original idea with
> FLUSH BINARY LOGS DELETE DOMAIN was to allow fixing this annoyance by
> removing such domains from gtid_binlog_state once they are no longer needed
> anywhere.
>
> I am not sure my original suggestion of using PURGE LOGS was ever a good
> idea, or is ever needed.

I think it remains optional, as I wrote in my reply last night.
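
To make the intent of point 3 concrete, here is a small Python sketch of the
state change. It is a model under my own assumptions, not the server
implementation: I assume a domain may be dropped from gtid_binlog_state only
when no remaining binlog file still contains events from it, and the function
name and its arguments are hypothetical.

```python
def delete_domain(binlog_state, domain_id, domains_in_remaining_logs):
    """Return binlog_state with every GTID of domain_id removed.

    binlog_state: 'domain-server-seq,...' string (a domain may appear
        more than once, with different server ids).
    domains_in_remaining_logs: domain ids still present in binlog files
        that have not been purged (assumed to be known to the caller).
    """
    if domain_id in domains_in_remaining_logs:
        # Assumed rule: refuse while the domain is still needed somewhere.
        raise ValueError(
            "domain %d still appears in existing binary logs" % domain_id
        )
    kept = [
        gtid
        for gtid in (g.strip() for g in binlog_state.split(","))
        if gtid and int(gtid.split("-")[0]) != domain_id
    ]
    return ",".join(kept)


# Domain 10 is long unused and no longer present in any remaining log.
print(delete_domain("0-1-100,0-2-90,10-2-200", 10, {0}))
```

In this model the purge step matters only in so far as it changes which
domains are still present in the remaining logs.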

Cheers,

Andrei

>
>  - Kristian.

