maria-developers team mailing list archive

Re: [External] Obsolete GTID domain delete on master (MDEV-12012, MDEV-11969)

 

Simon, Kristian, salute.

> Simon, thanks for your detailed answer.
>
> I see your point on having access to powerful tools when they are needed,
> even when such tools can be dangerous when used incorrectly. It reminds me
> of the old "goto considered harmful" which I never agreed with.
>
> It occurs to me that there are actually implicitly two distinct features
> being discussed here.
>
> One is: Forget this domain_id, it has been unused since forever, but do
> check that it actually _is_ unused to avoid mistakes (the check would be
> that the domain is already absent from all available binlog files). This is
> the one I originally had in mind.
>
> Another is: There is this domain_id in the old binlog files, it is causing
> problems, we need to recover and we know what we are doing. I think this is
> the one you have in mind in what you write, and it seems very valid as well.
>
> It helped me to think of them explicitly as two distinct features.
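
To make the two features concrete, here is a minimal sketch in Python. It is only a toy model: the dict-of-lists binlog representation and the names `domain_is_unused`, `may_delete_domain` and `force` are made up for illustration and say nothing about how the server would implement the check.

```python
from typing import Dict, List, NamedTuple


class Gtid(NamedTuple):
    domain_id: int
    server_id: int
    seq_no: int


def parse_gtid(text: str) -> Gtid:
    """Parse a MariaDB-style GTID string such as '10-1-42'."""
    domain_id, server_id, seq_no = (int(part) for part in text.split("-"))
    return Gtid(domain_id, server_id, seq_no)


def domain_is_unused(binlog_files: Dict[str, List[str]], domain_id: int) -> bool:
    """True if no GTID in any available binlog file belongs to domain_id."""
    return all(
        parse_gtid(text).domain_id != domain_id
        for gtids in binlog_files.values()
        for text in gtids
    )


def may_delete_domain(binlog_files: Dict[str, List[str]], domain_id: int,
                      force: bool = False) -> bool:
    """Feature one refuses while the domain is still present in the logs;
    feature two (force=True, hypothetical) skips the check entirely."""
    return force or domain_is_unused(binlog_files, domain_id)
```

With logs `{"binlog.000001": ["0-1-1", "10-2-5"]}`, deleting domain 10 is refused in the checked variant and allowed only when the caller explicitly forces it.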
>
> Also, Andrei's suggestion to fix IGNORE_DOMAIN_IDS to be able to connect to
> a master with that domain and completely ignore it seems useful for some of
> the scenarios you mention.

And this would be for a Master-Slave setup where the Master knows more domains
than the Slave.
In the reverse scenario (MDEV-12012 is a sort of that), when the Slave knows more,
we have also learned (thanks to Kristian, see the latest comments on MDEV-12012) another
method to ignore the *discrepancy*: faking the "problematic" master's binlog state with a
dummy GTID group. Once that is done, the Slave should be able to connect to such a Master.

This observation should come as a relief, as we don't have to consider keeping the old
logs with the discarded domain's events. The new FLUSH-delete-domain can be
run at the user's convenience.

>
>> Imagine a replication chain of M[aster] ―> S[lave]1, S[lave]2, A[ggregate]1
>> and A[ggregate]1 ―> A[ggregate]2 , A[ggregate]3, ….
>
>> If M dies and say A1 happens to be more up to date than S1, S2 then we may want to promote
>> A1 to be the new master, and move S1, S2 under A1, move A2 under A1
>> (but promote as the aggregate writeable master),
>> and move A3 under A2. This would not be the “desired” setup as probably we’d end
>> up throwing away all the aggregate data on A1.
>
> Right, I see. Throwing away table data needs matching editing of the binlog
> history to give a consistent replication state. And indeed in a failover
> scenario, waiting for logs to be purged/purgeable does not seem appropriate.
>
>> In this specific case it may be you really do want to hide the 2 sets
>> of domains and only show one
>> to the S1, S2 boxes, but maintain 2 domains on A2, A3.
>
> Agree. So a fixed IGNORE_DOMAIN_IDS would seem helpful here.

True.

>
>> It depends but in my opinion in most cases letting replication flow is more
>> important than having 100% master and slave consistency. The longer the
>> slave is stopped the more differences there are.
>>
>> And when you get in a situation like this you’re very tempted to go back to
>> binlog file plus position, to scan the bin logs with tools like mysqlbinlog
>> and do it the old way like we used to do years ago.  This is tedious and error
>> prone but if you’re careful it works fine. The whole idea of GTID is to avoid
>> the DBA ever having to do this…
>
> Right. Though once multiple domains are involved, the binlog is effectively
> multiple streams, and using the old-style single file/offset position may be
> tricky.
>
> But if IGNORE_DOMAIN_IDS works for master connection as well, then the slave
> has the ability to say exactly which domains it wants to see, and exactly
> where in each of those domains it wants to start (gtid_slave_pos), so that
> should be quite flexible.
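
The per-domain start position plus a domain filter can be sketched as below. This is a toy model in Python, not server code: `events_for_slave` is a hypothetical helper, the binlog is just a list of GTID strings, and starting a domain with no position from the beginning of the list is an assumption of the model.

```python
def parse_gtid_pos(pos):
    """Map domain_id -> (server_id, seq_no) for a position like '0-1-100,1-2-50'."""
    result = {}
    for item in filter(None, (part.strip() for part in pos.split(","))):
        domain_id, server_id, seq_no = (int(x) for x in item.split("-"))
        result[domain_id] = (server_id, seq_no)
    return result


def events_for_slave(binlog, slave_pos, ignore_domain_ids=()):
    """Return the GTIDs a slave would receive from `binlog`.

    Each domain starts right after the GTID named for it in `slave_pos`;
    domains listed in `ignore_domain_ids` are dropped altogether.
    """
    start = parse_gtid_pos(slave_pos)
    ignored = set(ignore_domain_ids)
    reached = set()  # domains whose start GTID has already been passed
    wanted = []
    for text in binlog:
        domain_id, server_id, seq_no = (int(x) for x in text.split("-"))
        if domain_id in ignored:
            continue
        if domain_id in start and domain_id not in reached:
            # Still before (or exactly at) the slave's position in this domain.
            if start[domain_id] == (server_id, seq_no):
                reached.add(domain_id)
            continue
        wanted.append(text)
    return wanted
```

The start point is located by position in the stream rather than by comparing sequence numbers, so the sketch does not depend on sequence numbers being in order within a domain.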
>
> When I designed GTID I actually had this very much in mind, to allow GTID to
> be a full replacement for the old style of replication and to allow to do
> what is needed to solve the problem at hand. For example, this is why the
> code tries so hard to deal with out-of-order GTID sequence numbers (as
> opposed to just refusing to ever operate with those).
>
> On the other hand, it was also a goal to be much more consistent and strict
> and try to prevent silent failures and inconsistencies. These two goals tend
> to get in conflicts in some areas. Hence for example the
> gtid_strict_mode.

I can only add that the master side (think of a fan-related metaphor) is
always better off being strict.
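
As a rough illustration of the strict versus lenient trade-off, here is a small Python sketch. It only models the idea of refusing out-of-order sequence numbers within a domain; the function name and the exception are hypothetical, and the real gtid_strict_mode checks are more involved than this.

```python
class OutOfOrderGtid(Exception):
    """Raised by the toy model when strict checking meets an out-of-order GTID."""


def check_binlog_order(gtids, strict):
    """Return the GTIDs whose seq_no does not increase within their domain.

    With strict=True the first such GTID raises instead of being tolerated.
    """
    last_seq = {}       # highest seq_no seen so far, per domain
    out_of_order = []
    for text in gtids:
        domain_id, _server_id, seq_no = (int(x) for x in text.split("-"))
        if domain_id in last_seq and seq_no <= last_seq[domain_id]:
            if strict:
                raise OutOfOrderGtid(text)
            out_of_order.append(text)
        last_seq[domain_id] = max(seq_no, last_seq.get(domain_id, seq_no))
    return out_of_order
```

In the lenient case the caller gets a list of suspicious GTIDs and keeps going; in the strict case the problem surfaces right away, before it can spread further down the topology.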

>
> There are still a few features that were never implemented but should have
> been (like DELETE DOMAIN and binlog indexes for example), and it is surely
> not perfect.
>
>> So I see the DELETE DOMAIN (MariaDB) or “remove old UUID” (MySQL) type request
>> to be one that means the master will only pretend that it can serve or knows about
>> the remaining domains or UUIDs and if the slaves are sufficiently up to date they
>> really don’t care as their vision will be similar.  Such a command would be replicated,
>> right? It has to be for the slaves to change “their view” at the same moment
>> in replication (not necessarily time) as the master.
>
> Hm, good point about whether it will be replicated.
>
> FLUSH LOGS is replicated by default with an option not to, so a DELETE
> DOMAIN would be also, I suppose. This makes it seem even more dangerous,
> frankly. Imagine an active domain being deleted by mistake

So the point is to have a slave that is not affected and can rectify things
(e.g. by failing over to it as the promoted Master)?

>, now the mistake
> immediately propagates to all servers in the replication topology, ouch.
>
> Maybe there should be an option, for example
>
>   FLUSH BINARY LOGS DELETE DOMAIN 10 NOCHECK
>
> or
>
>   FLUSH BINARY LOGS DELETE DOMAIN 10 ALLOW ACTIVE

Something like this. Also, the choice between 'NOCHECK' and 'ALLOW
ACTIVE' would be mandatory, that is, there would be no replication default for
'DELETE DOMAIN'. So the user first weighs how risky it would be to replicate it.

>
> or something.
> Note that the effect of deleting a domain is basically to add at the head of
> the binlog a mark that says the domain never existed. All of the old binlog
> is unchanged. So the command does not really immediately affect running
> replication, only new slave re-connections.
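
The "mark at the head of the binlog" can be pictured with the toy model below. Everything in it is an assumption made for illustration: the `BinlogFile` class, the per-file `gtid_list` header and the file naming are stand-ins, not the server's actual data structures.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class BinlogFile:
    name: str
    # Header: last seq_no known per domain at the moment the file was opened.
    gtid_list: Dict[int, int]
    events: List[str] = field(default_factory=list)


def delete_domain(files: List[BinlogFile], state: Dict[int, int],
                  domain_id: int) -> BinlogFile:
    """Rotate to a new binlog file whose header no longer mentions domain_id.

    Older files are left exactly as they were; only the current state and
    the header of the newly opened file change.
    """
    state.pop(domain_id, None)
    new_file = BinlogFile(name=f"binlog.{len(files) + 1:06d}",
                          gtid_list=dict(state))
    files.append(new_file)
    return new_file
```

A slave that is already streaming is not touched by this; only a slave that reconnects and has its position checked against the newest header would notice that the domain is gone.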
>
> Hope this helps,
>
>  - Kristian.

Thanks for a good piece of analysis, colleagues!

Cheers,

Andrei

