maria-discuss team mailing list archive
Fix different gtid-positions on domain 0 in multi-master
After moving half of the databases from our primary master-master cluster
to a different cluster, we now have a problem on our backup server, which
is a slave of both clusters.
Topology before the migration:
master1 <-> master2 -> backup-slave
Topology during the migration:
master1 <-> master2 -> backup-slave <-> master3 <-> master4

Topology after the migration:
master1 <-> master2 -> backup-slave <- master3 <-> master4
The gtid_domain_ids before/during the migration were:
master1/master2 : 0
master3/master4 : 20
master2 -> backup: Replicate_Do_Domain_Ids: 0
master3 -> backup: Replicate_Do_Domain_Ids: 20
backup->master3: Replicate_Do_DB: DbToMigrate1,DbToMigrate2,etc
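For reference, the domain filters above were set per connection with the multi-source CHANGE MASTER syntax, roughly like this (the connection names 'master2'/'master3' are assumed here):

```sql
-- on backup-slave: per-connection domain filters (connection names assumed)
STOP SLAVE 'master2';
CHANGE MASTER 'master2' TO DO_DOMAIN_IDS = (0);
START SLAVE 'master2';

STOP SLAVE 'master3';
CHANGE MASTER 'master3' TO DO_DOMAIN_IDS = (20);
START SLAVE 'master3';
```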
Server version: 10.1.9
So after the migration, a typical gtid_position would be:
As long as I keep the connection backup->master3 running (which is safe,
because no new transactions occur on the migrated databases on master1/2
anymore), the position on domain 0 keeps getting recorded on master3.
The problem is that as soon as I stop that connection, master2 and
master3 have different gtid positions for domain 0, and a stop/start of
the replication master3->backup results in the error:

"Got fatal error 1236 from master when reading data from binary log:
'Error: connecting slave requested to start from GTID 0-1-3898746614,
which is not in the master's binlog'"
I have tried moving master1/2 to domain_id:1 and removing domain_id:0
from the gtid_slave_pos on backup, but starting the replication
master2->backup results in the error:

"Got fatal error 1236 from master when reading data from binary log:
'Could not find GTID state requested by slave in any binlog files.
Probably the slave state is too old and required binlog files have been
purged.'"
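What I did to remove domain 0 from the slave position was roughly the following (the domain-20 value shown is only illustrative; gtid_slave_pos can only be set while all slave connections are stopped):

```sql
-- on backup: drop domain 0 from the slave position
STOP ALL SLAVES;
-- keep only the domain-20 part of the old value,
-- e.g. '0-1-3898746614,20-3-1000' becomes '20-3-1000' (value illustrative)
SET GLOBAL gtid_slave_pos = '20-3-1000';
START ALL SLAVES;
```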
I have tried to find a way to insert an empty transaction with the last
domain_id:0 GTID on master3, to bring master2/master3 back in sync on
that domain, but I could not find a way to do that in MariaDB.
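The closest sketch I could come up with is forcing the GTID of the next binlogged statement via the session variables gtid_domain_id, server_id and gtid_seq_no (all need SUPER; the sequence number below just reuses the one from the error message as an illustration):

```sql
-- on master3: sketch of injecting a "do-nothing" transaction with a forced GTID
SET SESSION gtid_domain_id = 0;
SET SESSION server_id = 1;            -- match the server_id in the 0-1-... GTID
SET SESSION gtid_seq_no = 3898746614; -- the last domain-0 seq_no on master2
-- any binlogged no-op will do; DROP TABLE IF EXISTS on a missing table is logged
DROP TABLE IF EXISTS test.gtid_dummy_does_not_exist;
-- restore the session to master3's own defaults
SET SESSION gtid_domain_id = DEFAULT;
SET SESSION server_id = DEFAULT;
```

I am not sure whether this is a supported way to do it, or whether it has side effects on master4 further down the chain.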
I have also tried to find a way to purge domain 0 from master3/master4,
but the only way I have found so far is a "RESET MASTER" on master3,
which would break replication between master3 and master4.
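If upgrading were an option: I believe later MariaDB releases (10.1.23+, if I remember the version right, so not our 10.1.9) add a way to drop a domain from the binlog state without RESET MASTER, provided the remaining binlog files contain no events from that domain, something like:

```sql
-- purge old binlogs first so no domain-0 events remain (file name illustrative)
PURGE BINARY LOGS TO 'mariadb-bin.000123';
FLUSH BINARY LOGS DELETE_DOMAIN_ID = (0);
```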
Are there other ways to fix this issue, so that I can have reliable
replication master3->backup without having to keep the dummy replication
running?