maria-developers team mailing list archive

Proposed changes to GTID syntax

 

I am planning some changes to the user interface of Global Transaction
ID. Testing by Elena has shown usability issues that I want to address with
these changes, but I wanted to give everyone a chance to voice their opinion
on the changes.


The problem:

The issue concerns when a slave server initially connects to the master using
GTID. The slave needs to send to the master the GTID position it wants to
start replicating from (the GTID of the last transaction executed within each
replication domain).

If the connecting server was already a slave to another master before, then
the position is the last replicated event(s), stored in table
mysql.rpl_slave_state. But if the server is the prior master that is now to
become a slave, we need to connect at the position of the last event(s) logged
by the prior master in the binlog.

The problem occurs when users do manual updates on the slave and those updates
get into the binlog of the slave. Of course, in principle one should not do
that, as it creates differences between the master and the slave. But it is no
doubt a very common thing for users to do. Maybe they need to fix some
replication problem, or maybe add an account only on the slave, and
they forget to set @@sql_log_bin=0 while doing the change, so it gets into the
slave binlog.

With current code, if the slave connects to a new master right after such
manual changes have been logged to the binlog on the slave, it will connect at
the GTID position of those changes, not at the last replicated transaction.
This is because doing changes directly on the slave server in effect makes that
server temporarily a "master". And besides, this is user error anyway, creating
divergence between the binlogs on slave and master server.

But I agree with Elena that this will be a nasty surprise to many users trying
out GTID for the first time. It is inconsistent with old-style replication,
where direct updates on the slave never change the slave replication
position. And many users will not read the complete documentation on GTID
before trying to use it, and will not understand the issues with manual
changes on a slave in effect turning it into a master temporarily.

Note that this affects usability when users abuse the replication, using it
against the recommended way, not taking care to keep master and slaves in
sync. For disciplined usage, e.g. running with the "strict GTID" mode that we
plan to implement, these kinds of problems will never occur in the first
place. But I want to do as much as possible to also make things work well for
"undisciplined" users.


The proposed solution:

There will be two ways to configure a slave to use GTID. One will be for
"disciplined" usage, same as the current code, where slave connects with
whatever is most recent in mysql.rpl_slave_state or binlog. The other will be
for "sloppy" usage, where the user will need to explicitly state whether to
use the position from the last replicated transaction, or to use what is in
the binlog.

I hope this will solve the problem. Since new users need to learn new syntax
to enable GTID anyway (CHANGE MASTER ... master_use_gtid=...), they will at
least have made a conscious choice about whether to use one syntax or the
other. And for "disciplined" users there is not really any change; they will
just use the syntax for the current behaviour.

I also think this improves things on the conceptual level. Currently, there is
too much magic about the value of the variable @@GLOBAL.gtid_pos. You can set
this manually (and that updates mysql.rpl_slave_state), but it also changes
value whenever something is added to the binlog, and you cannot manually set
it to something that conflicts with the binlog. This magic all becomes
unnecessary with the proposed changes.

So there will be some changes to the syntax for CHANGE MASTER, and some stuff
will be renamed to make things clearer.


Details:

CHANGE MASTER syntax will be modified to this:

1. CHANGE MASTER TO master_use_gtid = current_gtid_pos

This is the "disciplined" mode, the same as master_use_gtid=1 in current
code. It will use as replication start position whatever is most recent,
last replicated GTID or last binlogged GTID.

2. CHANGE MASTER TO master_use_gtid = slave_gtid_pos

This is "sloppy" mode, for pointing an existing slave to a new master. It will
only look at the last replicated GTID for the start position, not at any
transactions logged directly to the slave binlog.

3. CHANGE MASTER TO master_use_gtid = off

For completeness: this turns off using GTID to connect and goes back to
old-style replication.
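
To make the difference between the modes concrete, here is a minimal Python
sketch of how the start position could be chosen. It is only a model, not
server code. The function names, the (domain_id, server_id, seq_no) tuple
representation, and the rule that "most recent" means the higher sequence
number within a domain are all assumptions made for illustration.

```python
# Sketch only: models the proposed master_use_gtid modes, not server code.
# A GTID is represented as a tuple (domain_id, server_id, seq_no).

def most_recent(replicated, binlogged):
    """Per domain, pick the GTID with the higher seq_no (an assumption)."""
    result = dict(replicated)
    for domain, gtid in binlogged.items():
        if domain not in result or gtid[2] > result[domain][2]:
            result[domain] = gtid
    return result


def start_position(mode, replicated, binlogged):
    """Return {domain_id: gtid} the slave would send, or None for "off".

    replicated: last replicated GTID per domain
    binlogged:  last GTID written to this server's own binlog per domain
    """
    if mode == "current_gtid_pos":
        return most_recent(replicated, binlogged)
    if mode == "slave_gtid_pos":
        return dict(replicated)
    if mode == "off":
        return None  # old-style replication, no GTID position is sent
    raise ValueError(f"unknown master_use_gtid mode: {mode!r}")


# A slave (server_id 2) replicated up to 0-1-100 from its master (server_id 1),
# then a manual change was logged to the slave's own binlog as 0-2-101.
replicated = {0: (0, 1, 100)}
binlogged = {0: (0, 2, 101)}

print(start_position("current_gtid_pos", replicated, binlogged))
print(start_position("slave_gtid_pos", replicated, binlogged))
```

In this scenario the "disciplined" mode starts from the manual change
(0-2-101), while the "sloppy" mode ignores it and starts from the last
replicated transaction (0-1-100).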

There will be three new system variables introduced, replacing the current
@@GLOBAL.gtid_pos:

1. @@GLOBAL.slave_gtid_pos. Read-write. This is the set of last replicated
GTID, per replication domain.

2. @@GLOBAL.binlog_gtid_pos. Read only. This is the set of last binlogged
GTID, per domain.

3. @@GLOBAL.current_gtid_pos. Read only. This is a combination of
@@GLOBAL.slave_gtid_pos and @@GLOBAL.binlog_gtid_pos: per domain, it holds the
most recent GTID either replicated or binlogged.

The current mysql.rpl_slave_state table will be renamed to
mysql.slave_gtid_pos.

Note how the naming is now more consistent. slave_gtid_pos means the most
recently replicated GTIDs, both for table, variable, and CHANGE MASTER syntax.
Similarly, current_gtid_pos is used for both the variable and CHANGE MASTER to
mean the most recent GTID either replicated or binlogged. The user can see the
GTID position the slave will start replicating at by inspecting the value of
the corresponding variable.

When using master_use_gtid=slave_gtid_pos, the following statement can be used
on a server that used to be the master but should now become a slave:

    SET GLOBAL slave_gtid_pos = @@binlog_gtid_pos;

This will load whatever was logged to the binlog into the slave position. When
using master_use_gtid=current_gtid_pos, there is no need to do anything
special: the binlog will have the most recent GTID on a prior master, so it
will be used automatically.
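
The effect of that SET statement on a prior master can be sketched in Python
as follows. Again this is only a model under stated assumptions: the Server
class and its method names are hypothetical, and GTIDs are written in the
domain-server-seqno text form.

```python
# Sketch only: a hypothetical model of the proposed variables, not server code.

class Server:
    def __init__(self):
        self.slave_gtid_pos = {}   # domain_id -> last replicated GTID
        self.binlog_gtid_pos = {}  # domain_id -> last binlogged GTID

    def log_local_transaction(self, gtid):
        """Record a transaction written directly to this server's binlog."""
        domain_id = int(gtid.split("-")[0])
        self.binlog_gtid_pos[domain_id] = gtid

    def load_binlog_into_slave_pos(self):
        """Models: SET GLOBAL slave_gtid_pos = @@binlog_gtid_pos;"""
        self.slave_gtid_pos = dict(self.binlog_gtid_pos)

    def connect_position(self):
        """Position sent with master_use_gtid=slave_gtid_pos, as a GTID list."""
        return ",".join(
            self.slave_gtid_pos[d] for d in sorted(self.slave_gtid_pos)
        )


# A prior master that only ever logged its own transactions.
old_master = Server()
old_master.log_local_transaction("0-1-100")
old_master.log_local_transaction("1-1-7")

before = old_master.connect_position()  # nothing was ever replicated
old_master.load_binlog_into_slave_pos()
after = old_master.connect_position()
print(repr(before), repr(after))
```

Without the SET statement, the prior master has an empty slave position in
this model; after it, the slave position matches what was binlogged in each
domain.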

I think these changes will be a good improvement, justifying doing the change
this late during the 10.0 release stage. But I very much welcome input and
suggestions for doing things differently.

 - Kristian.

