maria-developers team mailing list archive

Thread
Date
Re: [External] Obsolete GTID domain delete on master (MDEV-12012, MDEV-11969)

To: Kristian Nielsen <knielsen@xxxxxxxxxxxxxxx>
From: Simon Mudd <simon.mudd@xxxxxxxxxxx>
Date: Wed, 20 Sep 2017 22:58:04 +0200
Cc: maria-developers@xxxxxxxxxxxxxxxxxxx, andrei.elkin@xxxxxxxxxxx
In-reply-to: <87r2vcfht7.fsf@urd.knielsen-hq.org>
Hi Kristian,

Sorry for the late response.

> On 12 Sep 2017, at 10:21, Kristian Nielsen <knielsen@xxxxxxxxxxxxxxx> wrote:
> 
> Simon Mudd <simon.mudd@xxxxxxxxxxx> writes:
> 
>> ids. Obviously once all appropriate bin logs have been purged
>> (naturally by other means) then no special processing will be needed.
> 
> Right. Hence my original idea (which was unfortunately never implemented so
> far). If at some point a domain has been unused for so long that all GTIDs
> in that domain are gone, it is relatively safe to pretend that the domain
> never existed.
> 
> I would like to understand if you can think of significant use cases where
> the DBA needs to have active binlog files in the master containing some
> domain, while simultaneously pretending that this domain never existed.

The only case I can think of would be this:

Imagine a replication chain of M[aster] —> S[lave]1, S[lave]2, A[ggregate]1
and A[ggregate]1 —> A[ggregate]2 , A[ggregate]3, ….

A1 one has all the data of M but also extra databases where aggregate information is
made of the original data in M/S1/…  Additionally it might make sense that writes
to A1 use a different domain id.

If M dies and say A1 happens to be more up to date than S1, S2 then we may want to promote
A1 to be the new master, and move S1, S2 under A1, move A2 under A1 (but promote as the aggregate writeable master),
and move A3 under A2. This would not be the “desired” setup as probably we’d end
up thowing away all the aggregate data on A1.

The topology would end up as:

A1 [ignore any aggregate database/data] -> S1, S2, A2
and A2 —> A3

In this specific case it may be you really do want to hide the 2 sets of domains and only show one
to the S1, S2 boxes, but maintain 2 domains on A2, A3.

I have such a setup but have never had to handle such a failure scenario and it
may be that there are better ways to handle this but one thing I’m sure about:
I wouldn’t want to wipe out the binlogs especially on A1 as they hold valuable information
which may not be stored anywhere else.

Does that answer your question?

> Or if it is more of a general concern, and the inconvenience for users to
> have to save old binlogs somewhere else than the master's data directory and
> binlog index (SHOW BINARY LOGS).

If you have to "save binlogs somewhere” you’re doing stuff manually. If you manage
a lot of servers that’s really undesirable.

...
> 
> I understand the desire to not delete binlog files.
> 
> The problem is: If you want to have GTIDs with some domain in your active
> binlog files, _and_ you also want to pretend that this domain never existed,
> what does it mean? What is the semantics? It creates a lot of complexities
> for defining the semantics, for documenting it, for the users to understand
> it, and for the code to implement it correctly.

Yes. However, it also reflects realities of a slave connecting and trying to get back
in sync to a master which may have “more information” (in terms of GTID info)
than the slave but which is unable to “serve that information” to the slave.

I’ve forgotten now exactly how this is handled by MariaDB but know that in
a situation like that with MySQL GTID the sync just won’t work as you can’t
afaik tell the slave to “not worry about “uuid:X” or “domain:Y” but just try to sync
the rest.  And in MySQL GTID at least injecting millions of empty events to
get the slave’s GTID state inline with the missing UUID would be well ... rather
stupid.

In the perfect world this situation never happens. In the real world it
does and the DBA often has to just live with inconsistencies between data
stored on a master and a slave (which he can fix later) but *let replication flow*.

It depends but in my opinion in most cases letting replication flow is more
important than having 100% master and slave consistency. The longer the
slave is stopped the more differences there are.

And when you get in a situation like this you’re very tempted to go back to
binlog file plus position, to scan the bin logs with tools like mysqlbinlog
and do it the old way like we used to do years ago.  This is tedious and error
prone but if you’re careful it works fine. The whole idea of GTID is to avoid
the DBA ever having to do this…

> So basically, I do not understand what is the intended meaning of FLUSH
> BINARY LOGS DELETE DOMAIN d _and_ at the same time keeping GTIDs with domain
> d around in active binlog files? In what respects is the domain deleted, and
> in what respects not?

expire_logs_days allows you to keep bin logs for a time you’re comfortable with.
So you can restore from an old system and then roll forward and do point in time
recovery or simply catch up with the master. Normally slaves are _well_ ahead of
the oldest bin logs.

You have a copy of the bin logs on the master and if log_slave_updates is enabled
an "equivalent backup” on a number of slaves. You normally don’t expect to use these
files, but they’re there if you need them. If you wipe binlogs earlier than the configure
expiration then you may need to keep those bin logs somewhere else. That’s just more
to manage as the expiry is handled “automatically" for you by MariaDB.

So I see the DELETE DOMAIN (MariaDB) or “remove old UUID” (MySQL) type request
to be one that means the master will only pretend that it can serve or knows about
the remaining domains or UUIDs and if the slaves are sufficiently up to date they
really don’t care as their vision will be similar.  Such a command would be replicated,
right? It has to be for the slaves to change “their view” at the same moment
in replication (not necessarily time) as the master.

You need to be aware that slave’s won’t be able to request back before this point
but that’s our job. This is after all a clean up / housekeeping type problem,
and not something you’ll do much. My guess once every few months or
once a year and that’s if you have a large number of domains or uuids.

So with MariaDB replication I’d not expect this to happen much. For Oracle
GTID since there’s a UUID for each master then master switchovers will add
a new UUID to the list and if you do a few of these a year it seems quite possible
that you may have a gtid_executed containing 20 or 30 UUIDs. That really is messy
and in need of a clean up.  MariaDB’s situation is likely I think to require
a much smaller number of such clean ups but the idea of why
this might be needed I think is the same.

> For the master, the binlog files are mainly used to stream to connecting
> slaves. Deleting a domain means replacing the conceptual binlog history with
> one in which that domain never existed. So that domain will be ignored in a
> connecting slaves position, assuming it is served by another multi-source
> master. If a new GTID in that domain appears later, it will be considered
> the very first GTID ever in that domain.
> 
> So consider what happens if there is anyway GTIDs in that domain deeper in
> the binlog:
> 
> 1. An already connected slave may be happily replicating those GTIDs. If
> that slave reconnects (temporary network error for example), it will instead
> fail with unknown GTID, or perhaps just start silently ignoring all further
> GTIDs in that domain. This kind of unpredictable behaviour seems bad.
> 
> 2. Suppose a slave connects with a position without the deleted domain. The
> master starts reading the binlog from some point. What happens if a GTID is
> encountered that contains the deleted domain? The slave will start
> replicating that domain from some arbitrary point that depends on where it
> happened to be in other domains at the last disconnect. This also seems
> undesirable.
> 
> There may be other scenarios that I did not think about.

The maintenance we’re talking about should happen rather infrequently.
So the situations of it going wrong and “causing confusion” are because
you don’t control your replication well enough or are too eager to remove
domains while adding new ones.  In practice I think this is unlikely to happen
but if it does it will be user error because they did not understand what they
were doing.

Appropriate warnings are definitely in order in the documentation
of such commands and their usage.

> 
>> DBAs do not like to remove bin logs “early" as unless you keep a copy
>> somewhere you may lose valuable information,
>> for recovery, for backups etc. Not everyone will be making automatic
>> copies (as MySQL does not provide an automatic way to do this)
> 
> Understood. Maybe what is needed is a PURGE BINARY LOGS that removes the
> entries from the binlog index (SHOW BINARY LOGS), but leaves the files in
> the file system for the convenience of the sysadmin? (Well, you can just
> hand-edit binlog.index, but that requires master restart I think).

You’re back to doing stuff manually. Please: no. It doesn’t scale.

> 
>> The other comment I see mentioned here was “make sure all slaves are
>> up to date”. That’s going to be hard. The master can only be
>> aware of “connected slaves” and if you have intermediate masters, or a
> 
> Indeed, the master cannot ensure this. The idea is that the DBA, who decides
> to delete a domain, must understand that this should not be done if any
> slave still needs GTIDs from that domain. This is similar to configuring
> normal binlog purge, where the DBA needs to ensure that binlogs are kept
> long enough for the needs of the slowest slave.

Yep. The DBA needs to be smart enough to “not worry about the problem”
or to understand what he’s doing. So documentation needs warnings,
though of course no-one ever reads the docs...

> 
>> FWIW expiring old domains is good to do. There’s a similar FR for
> 
>> completely different the problem space is the same. Coming up with a
>> solution which is simple to use and understand and also
>> avoids where that’s possible making mistakes which may break
>> replication is good. So thanks for looking at this.
> 
> Indeed. And the input from people like you with strong operational
> experience is very valuable to end up with a good solution, hence my request
> for additional input.

Thanks for sharing your idea of things. Most of these things seem unimportant
if you manage one or two servers and can shut them down and reconfigure them
at your own convenience.

However, when you manage more systems and can not shutdown masters the
software (MariaDB/MySQL) needs to help you and make this sort of task an easy
no-brainer which we can forget about, allowing us to get on with other things.
Any help you can provide in solving these problems is certainly appreciated.

Simon
Follow ups

Re: [External] Obsolete GTID domain delete on master (MDEV-12012, MDEV-11969)
From: Kristian Nielsen, 2017-09-21
References

Obsolete GTID domain delete on master (MDEV-12012, MDEV-11969)
From: andrei . elkin, 2017-09-06
Re: Obsolete GTID domain delete on master (MDEV-12012, MDEV-11969)
From: Kristian Nielsen, 2017-09-06
Re: Obsolete GTID domain delete on master (MDEV-12012, MDEV-11969)
From: andrei . elkin, 2017-09-08
Re: [External] Obsolete GTID domain delete on master (MDEV-12012, MDEV-11969)
From: Simon Mudd, 2017-09-11
Re: [External] Obsolete GTID domain delete on master (MDEV-12012, MDEV-11969)
From: Kristian Nielsen, 2017-09-12