Re: Review of patch for MDEV-4820


Alright. I'd say if this is the only meaning current_pos should have
then the name "current" is somewhat misleading.
But ok, I'll set both gtid_binlog_state and gtid_slave_pos. It seems
to be working so far.


On Sat, Aug 24, 2013 at 1:00 AM, Kristian Nielsen
<knielsen@xxxxxxxxxxxxxxx> wrote:
> Pavel Ivanov <pivanof@xxxxxxxxxx> writes:
>> I took 10.0-base r3685. Started a new, just-bootstrapped server with
>> server_id = 1. It has @@global.gtid_binlog_pos,
>> @@global.gtid_slave_pos and @@global.gtid_current_pos empty. Then I
>> execute
>> set global gtid_binlog_state = '0-10-10'
>> After that @@global.gtid_binlog_pos = '0-10-10' as expected, but both
>> @@global.gtid_slave_pos and @@global.gtid_current_pos are still empty.
>> Because of that server won't be able to replicate from master.
>> If I set gtid_binlog_state to '0-1-10' though
>> @@global.gtid_current_pos changes to '0-1-10' and everything is fine.
> The short answer is that you should just set both gtid_slave_pos and
> gtid_binlog_state on the new server.
>   SET GLOBAL gtid_binlog_state = '0-10-10';
>   SET GLOBAL gtid_slave_pos = @@GLOBAL.gtid_binlog_pos;
> For the longer answer, let me try to explain:
> The gtid_binlog_pos and the gtid_slave_pos are different concepts in
> MariaDB. The former is the last GTID logged into the binlog (for each
> domain). The latter is the last GTID replicated by the slave.
> These become different because on the one hand slave can use
> --log-slave-updates=0 (so binlog is not updated), and on the other hand I did
> not want to add overhead of updating gtid_slave_pos for every transaction on
> the master. So a GTID that goes into one of them may or may not go into the
> other.
> Now let us set up a slave with
>     CHANGE MASTER TO master_host= ... , master_use_gtid=slave_pos;
> The slave starts replication at the value of gtid_slave_pos. Every replicated
> GTID updates gtid_slave_pos, so to switch master we can just point it to the
> new host and it will continue from the correct point.
> But suppose we promote a new master, and later want the old master to
> become a slave. The old master did not update gtid_slave_pos, so the point at
> which to start is the last GTID logged to the binlog, gtid_binlog_pos. Thus to
> start the old master replicating as a slave one should use:
>     SET GLOBAL gtid_slave_pos = @@GLOBAL.gtid_binlog_pos;
>     CHANGE MASTER TO master_host= ... , master_use_gtid=slave_pos;
> and then things will proceed correctly with the new slave server.
> So this is how you should think of the variables. The gtid_slave_pos is the
> position at which to start replication for a slave. The gtid_binlog_pos is the
> last GTID logged into the binlog.
> Now, this creates an asymmetry - to switch a server to replicate from a new
> master, the user has to know if the server was a master or a slave before, and
> do it differently depending on which it is.
> So I wanted to provide a way to avoid this asymmetry, and I implemented CHANGE
> MASTER TO master_use_gtid=current_pos for this. In this mode, when the slave
> connects, it looks into both the gtid_slave_pos and the gtid_binlog_pos to
> decide which of these has the most recent GTID - and then uses that GTID as
> the point to start replication at.
> If the server was a master before, then the last GTID in the binlog will have the
> server's own server_id; _and_ the sequence number will be bigger than what is
> in the gtid_slave_pos because sequence numbers on a master are always
> generated bigger than any seen before. So in this case we use the last GTID in
> the binlog to connect to. Otherwise we use the gtid_slave_pos.
> So that is _all_ that gtid_current_pos is - it is a way for the server to
> guess whether it was a master or a slave before, and act accordingly. A bit of
> magic for casual users that do not want to be aware of whether the server they
> are setting up as a slave was a slave already before, or a master.
> So the point is that if you want to use gtid_current_pos on a newly set-up
> server, you need to provide correct values for _both_
> gtid_binlog_pos/gtid_binlog_state _and_ gtid_slave_pos. Because
> gtid_current_pos is the result of combining the two.
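
To check my understanding of the rule described above, here is a rough
Python model of how gtid_current_pos would be derived from the other two
values. This is only a sketch of the behaviour as explained in this thread,
not the server's actual code in rpl_slave_state::iterate(); the names Gtid,
parse_gtid_pos and current_pos are made up for illustration, and positions
are assumed to be comma-separated domain-server-seqno triples.

```python
from typing import Dict, NamedTuple


class Gtid(NamedTuple):
    domain_id: int
    server_id: int
    seq_no: int


def parse_gtid_pos(text: str) -> Dict[int, Gtid]:
    """Parse 'D-S-N[,D-S-N...]' into {domain_id: Gtid}; '' gives {}."""
    pos: Dict[int, Gtid] = {}
    for item in filter(None, (part.strip() for part in text.split(","))):
        domain_id, server_id, seq_no = (int(x) for x in item.split("-"))
        pos[domain_id] = Gtid(domain_id, server_id, seq_no)
    return pos


def current_pos(slave_pos: str, binlog_pos: str, own_server_id: int) -> str:
    """Combine slave_pos and binlog_pos per domain (hypothetical model).

    Start from gtid_slave_pos. A binlog GTID replaces the slave GTID of
    its domain only if it carries this server's own server_id and has a
    bigger sequence number (or the domain is absent from slave_pos).
    """
    slave = parse_gtid_pos(slave_pos)
    result = dict(slave)
    for domain_id, gtid in parse_gtid_pos(binlog_pos).items():
        if gtid.server_id != own_server_id:
            continue  # the server_id check discussed in this thread
        seen = slave.get(domain_id)
        if seen is None or gtid.seq_no > seen.seq_no:
            result[domain_id] = gtid
    return ",".join(
        f"{g.domain_id}-{g.server_id}-{g.seq_no}"
        for _, g in sorted(result.items())
    )
```

With server_id = 1 and an empty gtid_slave_pos, current_pos("", "0-10-10", 1)
comes out empty while current_pos("", "0-1-10", 1) gives "0-1-10", which
matches what I observed. Once gtid_slave_pos is set as well,
current_pos("0-10-10", "0-10-10", 1) gives "0-10-10".
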
>> It looks like the problem is in the server_id check in the first loop
>> in rpl_slave_state::iterate(). Can it be removed from there?
> I think so - in strict mode, the most recent GTID will always be the one with
> the highest sequence number, so the server_id check is not needed. On the
> other hand, if things are done correctly, the server_id check will make no
> difference, as a GTID with different server_id cannot get into the binlog
> without also getting into gtid_slave_pos.
> But for now I have other, more critical things I want to fix first - I think
> this is not a critical thing, just setting gtid_slave_pos on the new server
> should make things work for you? (else let me know if I missed something).
>  - Kristian.

