maria-developers team mailing list archive

Re: MariaDB 10 and MaxScale binlog router

 

Jean-François Gagné <jeanfrancois.gagne@xxxxxxxxxxx> writes:

>>> (2) SELECT binlog_gtid_pos('mst-bin.000001',310)
>> This is used by the slave to obtain the correct GTID position corresponding to
>> the position at which it is starting, when it is connecting in non-GTID mode.
>>
>> When the slave has this information, it becomes easy for the DBA to switch to
>> a new master using GTID:
>>
>>    STOP SLAVE;
>>    CHANGE MASTER TO master_host='new_master', master_use_gtid=slave_pos;
>>    START SLAVE;
>>
>> This works even if the slave was not using GTID mode prior to the CHANGE
>> MASTER, thanks to that SELECT binlog_gtid_pos().
>
> Is this really needed ?

Well, it's needed in the general case to be able to get the correct GTID
position to automatically switch to a different master using GTID, as above.

It is not _really_ needed in the sense that the DBA can just manually
SET GLOBAL gtid_slave_pos='<position>' instead. Or the DBA might not have any
need for using GTID in the first place.

One of the primary goals of MariaDB GTID was to make it easy to start using
it; that is why this was implemented.

> My understanding is that the SQL_THREAD will remember the GTID of the
> last executed transaction, which makes the GTID provided by "SELECT
> binlog_gtid_pos('mst-bin.000001',310)" quickly obsolete. This SELECT

> There might be a subtlety with multiple DomainIDs that I am missing:
> that SELECT might return the GTIDs of all write domains up to that
> position...  Again, this needs to read all the binlog up to that

Correct. The GTID position in the general case has one GTID per replication
domain id. And since some of those domains may have no replicated transactions
for a long time, the binlog_gtid_pos() call is used to fetch the full position.
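
To make the "one GTID per domain" point concrete, here is a rough Python
sketch (not server code; the function names are made up for illustration).
It assumes the usual textual form of a MariaDB GTID position, a
comma-separated list of domain_id-server_id-seq_no triples:

```python
def parse_gtid_pos(pos):
    """Parse '0-1-100,1-2-57' into {domain_id: (server_id, seq_no)}."""
    state = {}
    for gtid in pos.split(","):
        gtid = gtid.strip()
        if not gtid:
            continue  # an empty position string means "no GTIDs yet"
        domain_id, server_id, seq_no = (int(x) for x in gtid.split("-"))
        # Only one GTID is kept per replication domain.
        state[domain_id] = (server_id, seq_no)
    return state


def format_gtid_pos(state):
    """Inverse of parse_gtid_pos(), with domains in ascending order."""
    return ",".join(f"{d}-{s}-{n}" for d, (s, n) in sorted(state.items()))
```

A domain that has been idle for a long time still needs its entry in the
position, which is why the last executed GTID alone is not enough.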

> Moreover, I guess that the GTID returned by this function should not
> be the GTID of the transaction at this position, but the GTID of the
> previous transaction.  This needs to read from the beginning of the
> binary logs (reading a binary log backward is not possible to my
> knowledge).  If the binlog file size is 100 GB... (you can see my
> point I think).  Also, if the previous position is not in the same

Yes, you are right, there will be a need to scan the most recent binlog. So
there is the potential for a performance regression, more so with large binlog
files and/or frequent slave connects.
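
As a rough model of that scan (a Python sketch over an in-memory list, not
the server's binlog reader; the function name and the tuple layout are
assumptions made for this example): start from the GTID state recorded at
the head of the binlog file, then apply every GTID event that begins before
the requested offset.

```python
def gtid_pos_by_scan(header_state, events, offset):
    """Return {domain_id: (server_id, seq_no)} as of `offset`.

    header_state: GTID state at the start of the file (hypothetical input).
    events: list of (start_offset, domain_id, server_id, seq_no), sorted by
            start_offset; a stand-in for the GTID events in a real binlog.
    """
    state = dict(header_state)
    for start, domain_id, server_id, seq_no in events:
        if start >= offset:
            break  # the transaction at `offset` itself is not yet applied
        state[domain_id] = (server_id, seq_no)
    return state
```

The cost is linear in the number of events before the offset, which is the
source of the concern with very large binlog files.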

The intention was to have an index on the binlog files so that the GTID
position can be found quickly (both for the binlog_gtid_pos() call in the
non-GTID case, and for GTID connect). But pressure to get the feature out
meant it was released without, and I agree that this was unfortunate.
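
One possible shape for such an index, sketched in Python (this is only an
illustration and not the design of any actual patch; the names, the tuple
layout and the checkpoint interval are all assumptions): store a snapshot of
the GTID state every few events, then binary-search for the nearest snapshot
and scan only the short tail after it.

```python
import bisect


def build_index(header_state, events, every=2):
    """Return checkpoints [(offset, state_snapshot, event_idx), ...].

    events: list of (start_offset, domain_id, server_id, seq_no), sorted.
    Each snapshot holds the state *before* the event at event_idx.
    """
    state = dict(header_state)
    index = [(0, dict(state), 0)]
    for i, (start, domain_id, server_id, seq_no) in enumerate(events):
        if i and i % every == 0:
            index.append((start, dict(state), i))
        state[domain_id] = (server_id, seq_no)
    return index


def gtid_pos_by_index(index, events, offset):
    """Look up the GTID state as of `offset` using the checkpoints."""
    offsets = [cp[0] for cp in index]
    k = bisect.bisect_right(offsets, offset) - 1  # last checkpoint <= offset
    _, snapshot, i = index[k]
    state = dict(snapshot)
    # Scan only the events between the checkpoint and the offset.
    while i < len(events) and events[i][0] < offset:
        _, domain_id, server_id, seq_no = events[i]
        state[domain_id] = (server_id, seq_no)
        i += 1
    return state
```

With checkpoints at a fixed interval, a lookup touches at most `every`
events after the binary search, regardless of the size of the binlog file.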

Jonas Oreland said he has a patch already that implements this, and that will
be contributed soon...

> write domain as the current transaction, the transaction from the
> right write domain (and all the other transactions from the other write
> domains) must be found.  This looks terribly inefficient.

> write domain is in the header of the binlog).  I would prefer to
> forbid the slave to use automatic positioning (with GTID) until it had
> read the header of the next binlog.  Basically, I prefer to push

That could be reasonable.

The counter-argument is that we need binlog indexing anyway for GTID mode. And
with binlog indexes, the overhead for binlog_gtid_pos() will be negligible. So
we could avoid the complications of introducing new kinds of states of a slave
("has a valid GTID position" vs. "does not have a valid GTID position").

In any case, if this causes a performance regression in practice, we will find
some solution. Waiting to get the position from the header of the next binlog
as you suggested, or some option to disable the binlog_gtid_pos() call, or
something.

> The need for that "SELECT binlog_gtid_pos(...)" makes it very hard to
> implement the MariaDB Slave Protocol in the Binlog Server in a
> "simple" way.  If it is not needed in the protocol, I would prefer to
> simplify the slave protocol than to complexify the Binlog Server.

The binlog server can probably just ignore that call. It should not cause
problems - only the automatic switch to GTID mode will not work, but I think
the binlog server does not support GTID anyway.

 - Kristian.
 

