maria-discuss team mailing list archive
Galera cluster with asynchronous slave
Johnny Antonsen <johnny@xxxxxxxxx>
Thu, 03 Jul 2014 14:28:54 +0200
First off, let me show you what we are trying to achieve:
MariaDBNode1 - HAProxy - keepalived <-
MariaDBNode2 - HAProxy - keepalived <- MariaDB Slave node with GTID
MariaDBNode3 - HAProxy - keepalived <-
I'm currently working on a high-availability project where we want to
use MariaDB 10 and Galera as the backend. Running this system as it is
with HAProxy and keepalived works great. We are able to write to the VIP,
which moves between the nodes, and HAProxy takes care of redirecting to
the server with the fewest connections.
The problem comes when we want to set up standard master/slave replication
from the cluster to an external slave. The slave is set up to connect to
the VIP (I have also tested connecting directly to the haproxied IP), and
Using_gtid is set to Slave_pos.
However, after some time, once the connection changes to a different
node through haproxy, the following error occurs:
Got fatal error 1236 from master when reading data from binary log:
'Error: connecting slave requested to start from GTID 3-1-422, which is
not in the master's binlog'
And the Slave_IO_State shows that it's no longer in sync.
I have run SELECT @@GLOBAL.gtid_slave_pos; on each node to check its
current GTID, and they all return 1-1-2145. However, sometimes, if I add
a lot of data, that value differs on some nodes, which is why I think the
slave gets confused.
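The per-node comparison can be scripted. Below is a minimal Python sketch, assuming the positions have already been collected from each node by running the query above. The node names, the sample value for node3, and the function name find_divergent_nodes are hypothetical, made up for illustration.

```python
from collections import Counter


def find_divergent_nodes(positions):
    """Return the nodes whose GTID position differs from the most common one.

    positions maps a node name to the string returned by
    SELECT @@GLOBAL.gtid_slave_pos on that node.
    """
    if not positions:
        return {}
    # Treat the most frequently reported position as the reference value.
    reference, _count = Counter(positions.values()).most_common(1)[0]
    return {node: pos for node, pos in positions.items() if pos != reference}


# Hypothetical sample: node3 reports a different value after a burst of writes.
sample = {
    "node1": "1-1-2145",
    "node2": "1-1-2145",
    "node3": "1-1-2190",
}
print(find_divergent_nodes(sample))
```

This only flags that the nodes disagree at the moment of sampling. It does not say which node is right.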
On the slave, when activating using_gtid=slave_pos, the following
Gtid_IO_Pos appears: 1-1-2464,2-3-420,3-1-422
From what I have read, this should be somewhat correct, as the first
value is the domain id and the second the server id. However, in the
config I have specified that
node 1 has server id 1, node 2 has id 2 and so on, and that the same
goes for gtid_domain_id. Is this the correct setup or do the nodes need
to have the same server-id or gtid_domain_id?
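To make the fields in that position string explicit, here is a minimal Python sketch, assuming the documented MariaDB GTID layout of domain_id-server_id-sequence. The names Gtid and parse_gtid_pos are hypothetical helpers, not part of any MariaDB tooling.

```python
from typing import NamedTuple


class Gtid(NamedTuple):
    domain_id: int
    server_id: int
    sequence: int


def parse_gtid_pos(pos):
    """Split a comma-separated GTID position string into Gtid tuples."""
    gtids = []
    for item in pos.split(","):
        item = item.strip()
        if not item:
            continue  # skip empty entries, e.g. an empty position string
        # Assumed layout: domain_id-server_id-sequence
        domain_id, server_id, sequence = (int(part) for part in item.split("-"))
        gtids.append(Gtid(domain_id, server_id, sequence))
    return gtids


# The position reported on the slave.
for gtid in parse_gtid_pos("1-1-2464,2-3-420,3-1-422"):
    print(gtid)
```

Read this way, the position holds one entry per replication domain (1, 2 and 3), and the 3-1-422 in the error message is the entry for domain 3.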
Surely there must be a good way to solve this? Is the system not built
to handle an asynchronous slave replicating from one random node?
Hope I can get some good feedback on this.
Just as a side note, I'm fairly new to MariaDB and Galera Cluster, so be
With best regards