maria-discuss team mailing list archive

New Question: MariaDB Galera cluster gtid's falling out of sync inspite of setting wsrep_cluster

To: maria-discuss@xxxxxxxxxxxxxxxxxxx
From: AskMonty KB <noreply@xxxxxxxxxxxx>
Date: Sat, 11 Jun 2016 18:35:04 -0000

Hello,

A new question has been asked in "MariaDB Community" by kk1674. Please answer it at http://mariadb.com/kb/en/mariadb-galera-cluster-gtids-falling-out-of-sync-inspite-of-setting-wsrep_c/ as the person asking the question may not be subscribed to the mailing list.

--------------------------------
I have asynchronous replication between two Galera clusters, which I will call the master cluster and the slave cluster. As expected, one node in the slave cluster receives the asynchronous replication stream, and Galera propagates it to the other nodes of the slave cluster. When the GTID counter is incremented on the nodes of the slave cluster, the node acting as the async slave keeps track of the master's server_id and builds the GTID from wsrep_gtid_domain_id, server_id, and the Galera counter + 1. The other nodes of the slave cluster, however, do not keep track of the server_id and build the GTID from wsrep_gtid_domain_id and the Galera counter + 1 only. See below:

Slave node of the slave cluster
show variables like '%gtid%';
+------------------------+-------------------------+
| Variable_name          | Value                   |
+------------------------+-------------------------+
| gtid_binlog_pos        | 2-100-36261             |
| gtid_binlog_state      | 2-200-36263,2-100-36261 |
| gtid_current_pos       | 1-100-36261             |
| gtid_domain_id         | 22                      |
| gtid_ignore_duplicates | OFF                     |
| gtid_seq_no            | 0                       |
| gtid_slave_pos         | 1-100-36261             |
| gtid_strict_mode       | OFF                     |
| last_gtid              | 2-200-36263             |
| wsrep_gtid_domain_id   | 2                       |
| wsrep_gtid_mode        | ON                      |
+------------------------+-------------------------+
11 rows in set (0.00 sec)

Another node in the slave cluster
show variables like '%gtid%';
+------------------------+-------------------------+
| Variable_name          | Value                   |
+------------------------+-------------------------+
| gtid_binlog_pos        | 2-100-36265             |
| gtid_binlog_state      | 2-200-36264,2-100-36265 |
| gtid_current_pos       | 1-100-36259             |
| gtid_domain_id         | 22                      |
| gtid_ignore_duplicates | OFF                     |
| gtid_seq_no            | 0                       |
| gtid_slave_pos         | 1-100-36259             |
| gtid_strict_mode       | OFF                     |
| last_gtid              |                         |
| wsrep_gtid_domain_id   | 2                       |
| wsrep_gtid_mode        | ON                      |
+------------------------+-------------------------+
11 rows in set (0.00 sec)
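
To make the drift between the two outputs above easier to see, the values can be compared with a small script. This is only a sketch: the helper names (parse_gtid, parse_gtid_list, diff_states) are made up for illustration, and the values are copied by hand from the two outputs rather than read from the servers. It relies only on the MariaDB GTID format domain_id-server_id-sequence_number.

```python
from typing import NamedTuple


class Gtid(NamedTuple):
    domain_id: int
    server_id: int
    seq_no: int


def parse_gtid(text: str) -> Gtid:
    """Parse a single GTID such as '2-100-36261'."""
    domain, server, seq = text.strip().split("-")
    return Gtid(int(domain), int(server), int(seq))


def parse_gtid_list(text: str) -> dict:
    """Parse a comma-separated GTID list into {(domain_id, server_id): seq_no}."""
    result = {}
    for item in text.split(","):
        if item.strip():
            g = parse_gtid(item)
            result[(g.domain_id, g.server_id)] = g.seq_no
    return result


def diff_states(a: str, b: str) -> dict:
    """Return {(domain_id, server_id): (seq_a, seq_b)} for entries that differ."""
    sa, sb = parse_gtid_list(a), parse_gtid_list(b)
    keys = sorted(set(sa) | set(sb))
    return {k: (sa.get(k), sb.get(k)) for k in keys if sa.get(k) != sb.get(k)}


# Values copied by hand from the two SHOW VARIABLES outputs above.
slave_node = {
    "gtid_binlog_state": "2-200-36263,2-100-36261",
    "gtid_slave_pos": "1-100-36261",
}
other_node = {
    "gtid_binlog_state": "2-200-36264,2-100-36265",
    "gtid_slave_pos": "1-100-36259",
}

binlog_diff = diff_states(
    slave_node["gtid_binlog_state"], other_node["gtid_binlog_state"]
)
slave_pos_diff = diff_states(
    slave_node["gtid_slave_pos"], other_node["gtid_slave_pos"]
)

print("gtid_binlog_state differences:", binlog_diff)
print("gtid_slave_pos differences:", slave_pos_diff)
```

Each reported entry is keyed by (domain_id, server_id) and lists the sequence number on the slave node first and on the other node second.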

This behavior has been observed in MariaDB 10.1.11, 10.1.13 and 10.1.14. In 10.1.11, the gtid_slave_pos table is replicated by Galera, so if the slave node goes down it is possible to move the slave process to another node by first setting the gtid_slave_pos variable there. Starting with 10.1.13, however, the table is no longer replicated. I understand that this was a fix for another issue, but now there is no way to determine the slave position in order to start replication on another node of the cluster.

Can this behavior be classified as a bug, and can a bug report be opened for it?

--------------------------------

To view or answer this question please visit: http://mariadb.com/kb/en/mariadb-galera-cluster-gtids-falling-out-of-sync-inspite-of-setting-wsrep_c/