maria-developers team mailing list archive
-
maria-developers team
-
Mailing list archive
-
Message #09722
Parallel replication slave stops with duplicate key errors
Hi Kristian,
I was trying out parallel replication and got an error that I am not sure what
to do with.
== Setup ==
I started a master with binlog_format=row,
I ran sysbench there to load the dataset ("prepare" mode, about 10M rows IIRC)
then I restarted the master and ran sysbench in "run" mode a few times.
Then I restarted the master again an ran a "CREATE TABLE" statement manually.
Just so that I have third binlog.
The master state is:
mysql> show binary logs;
+-------------+-----------+
| Log_name | File_size |
+-------------+-----------+
| pslp.000001 | 194918772 |
| pslp.000002 | 28895350 |
| pslp.000003 | 589 |
+-------------+-----------+
== Non-parallel test run ==
Then I connected with a slave, with all default settings. The slave had empty
datadir, so I made it a slave with this command:
CHANGE MASTER TO MASTER_HOST='127.0.0.1', MASTER_PORT=3310, MASTER_USER='root';
All server settings were pretty much default. It took a slave one hour to catch
up, but it did so, and encountered no errors.
Ok, this was serial replication
== Parallel test run ==
Then I killed the slave,
deleted its data directory, put an empty directory instead
I added this to slave.cnf file:
slave_parallel_threads=4
slave_parallel_mode=aggressive
and started the slave.
It ran in parallel for a few minutes, I saw this:
mysql> show processlist;
+----+-------------+-----------+------+---------+------+-----------------------------------------------+------------------+----------+
| Id | User | Host | db | Command | Time | State | Info | Progress |
+----+-------------+-----------+------+---------+------+-----------------------------------------------+------------------+----------+
| 3 | root | localhost | test | Query | 0 | init | show processlist | 0.000 |
| 4 | system user | | NULL | Connect | 15 | Waiting for master to send event | NULL | 0.000 |
| 5 | system user | | NULL | Connect | 0 | Write_rows_log_event::write_row(-1) | NULL | 0.000 |
| 6 | system user | | NULL | Connect | 0 | Waiting for work from SQL thread | NULL | 0.000 |
| 7 | system user | | NULL | Connect | 0 | Write_rows_log_event::write_row(-1) | NULL | 0.000 |
| 8 | system user | | NULL | Connect | 0 | Waiting for work from SQL thread | NULL | 0.000 |
| 9 | system user | | NULL | Connect | 14 | Waiting for room in worker thread event queue | NULL | 0.000 |
+----+-------------+-----------+------+---------+------+-----------------------------------------------+------------------+----------+
7 rows in set (0.00 sec)
but then it stopped. The error log has this:
2016-06-15 15:09:58 140702586379008 [Note] Slave SQL thread initialized, starting replication in log 'FIRST' at position 0, relay log './psergey-desktop-relay-bin.000001' position: 4
2016-06-15 15:09:59 140702586985216 [Note] InnoDB: Online DDL : Start
2016-06-15 15:09:59 140702586985216 [Note] InnoDB: Online DDL : Start reading clustered index of the table and create temporary files
2016-06-15 15:09:59 140702586985216 [Note] InnoDB: Online DDL : End of reading clustered index of the table and create temporary files
2016-06-15 15:09:59 140702586985216 [Note] InnoDB: Online DDL : Start applying row log
2016-06-15 15:09:59 140702586985216 [Note] InnoDB: Online DDL : End of applying row log
2016-06-15 15:09:59 140702586985216 [Note] InnoDB: Online DDL : Completed
2016-06-15 15:11:18 140702586985216 [ERROR] Slave SQL: Error during XID COMMIT: failed to update GTID state in mysql.gtid_slave_pos: 1062: Duplicate entry '0-1021' for key 'PRIMARY', Gtid 0-1-1021, Internal MariaDB error code: 1942
2016-06-15 15:11:18 140702586985216 [ERROR] Slave (additional info): Duplicate entry '0-1021' for key 'PRIMARY' Error_code: 1062
2016-06-15 15:11:18 140702586985216 [Warning] Slave: Duplicate entry '0-1021' for key 'PRIMARY' Error_code: 1062
2016-06-15 15:11:18 140702586985216 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'pslp.000002' position 8266
2016-06-15 15:11:18 140702586682112 [ERROR] Slave (additional info): Commit failed due to failure of an earlier commit on which this one depends Error_code: 1964
2016-06-15 15:11:18 140702586682112 [Warning] Slave: Duplicate entry '821827' for key 'PRIMARY' Error_code: 1062
2016-06-15 15:11:18 140702586682112 [Warning] Slave: Deadlock found when trying to get lock; try restarting transaction Error_code: 1213
2016-06-15 15:11:18 140702586682112 [Warning] Slave: Commit failed due to failure of an earlier commit on which this one depends Error_code: 1964
2016-06-15 15:11:18 140702586682112 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'pslp.000002' position 8266
2016-06-15 15:11:18 140702602623744 [ERROR] Slave (additional info): Commit failed due to failure of an earlier commit on which this one depends Error_code: 1964
2016-06-15 15:11:18 140702602623744 [Warning] Slave: Commit failed due to failure of an earlier commit on which this one depends Error_code: 1964
2016-06-15 15:11:18 140702602623744 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'pslp.000002' position 8266
2016-06-15 15:11:18 140702602926848 [ERROR] Slave (additional info): Commit failed due to failure of an earlier commit on which this one depends Error_code: 1964
2016-06-15 15:11:18 140702602926848 [Warning] Slave: Commit failed due to failure of an earlier commit on which this one depends Error_code: 1964
2016-06-15 15:11:18 140702602926848 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'pslp.000002' position 8266
2016-06-15 15:11:18 140702586985216 [ERROR] Slave (additional info): Commit failed due to failure of an earlier commit on which this one depends Error_code: 1964
2016-06-15 15:11:18 140702586985216 [Warning] Slave: Duplicate entry '0-1021' for key 'PRIMARY' Error_code: 1062
2016-06-15 15:11:18 140702586985216 [Warning] Slave: Commit failed due to failure of an earlier commit on which this one depends Error_code: 1964
2016-06-15 15:11:18 140702586985216 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'pslp.000002' position 8266
2016-06-15 15:11:18 140702586682112 [ERROR] Slave (additional info): Commit failed due to failure of an earlier commit on which this one depends Error_code: 1964
2016-06-15 15:11:18 140702586682112 [Warning] Slave: Duplicate entry '821827' for key 'PRIMARY' Error_code: 1062
... and there were lots of these errors. Eventually it stopped with:
2016-06-15 15:11:22 140702586682112 [Warning] Slave: Commit failed due to failure of an earlier commit on which this one depends Error_code: 1964
2016-06-15 15:11:22 140702586682112 [Warning] Slave: Commit failed due to failure of an earlier commit on which this one depends Error_code: 1964
2016-06-15 15:11:22 140702586682112 [Warning] Slave: Commit failed due to failure of an earlier commit on which this one depends Error_code: 1964
2016-06-15 15:11:22 140702586682112 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'pslp.000002' position 8266
2016-06-15 15:11:22 140702586379008 [Note] Slave SQL thread exiting, replication stopped in log 'pslp.000002' at position 8266
The status is:
mysql> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 127.0.0.1
Master_User: root
Master_Port: 3310
Connect_Retry: 60
Master_Log_File: pslp.000003
Read_Master_Log_Pos: 589
Relay_Log_File: psergey-desktop-relay-bin.000003
Relay_Log_Pos: 8549
Relay_Master_Log_File: pslp.000002
Slave_IO_Running: Yes
Slave_SQL_Running: No
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 1942
Last_Error: Error during XID COMMIT: failed to update GTID state in mysql.gtid_slave_pos: 1062: Duplicate entry '0-1021' for key 'PRIMARY'
Skip_Counter: 0
Exec_Master_Log_Pos: 8266
Relay_Log_Space: 223815659
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 1942
Last_SQL_Error: Error during XID COMMIT: failed to update GTID state in mysql.gtid_slave_pos: 1062: Duplicate entry '0-1021' for key 'PRIMARY'
Replicate_Ignore_Server_Ids:
Master_Server_Id: 1
Master_SSL_Crl:
Master_SSL_Crlpath:
Using_Gtid: No
Gtid_IO_Pos:
Replicate_Do_Domain_Ids:
Replicate_Ignore_Domain_Ids:
Parallel_Mode: aggressive
Can you advise, what does this error mean?
(is there any benefit in cc'ing this to maria-developers@ ? if yes, feel free
to..)
BR
Sergei
--
Sergei Petrunia, Software Developer
MariaDB Corporation | Skype: sergefp | Blog: http://s.petrunia.net/blog
Follow ups