← Back to team overview

maria-developers team mailing list archive

Re: Parallel replication slave stops with duplicate key errors

 

Sergey Petrunia <sergey@xxxxxxxxxxx> writes:

> I ran sysbench there to load the dataset ("prepare" mode, about 10M rows IIRC)
> then I restarted the master and ran sysbench in "run" mode a few times.
> Then I restarted the master again an ran a "CREATE TABLE" statement manually.
> Just so that I have third binlog.

Incidentally, you can also just use FLUSH LOGS to rotate the binlog to a new
file (but master restart is fine as well, of course).

> == Parallel test run == 
> Then I killed the slave, 
> deleted its data directory, put an empty directory instead
> I added this to slave.cnf file:
>
> slave_parallel_threads=4
> slave_parallel_mode=aggressive

> 2016-06-15 15:11:18 140702586985216 [ERROR] Slave SQL: Error during
> XID COMMIT: failed to update GTID state in mysql.gtid_slave_pos: 1062:
> Duplicate entry '0-1021' for key 'PRIMARY', Gtid 0-1-1021, Internal
> MariaDB error code: 1942

> I noticed that gtid_slave_pos is non transactional.. I've made it

Right, so it seems that the error is caused by non-transactional
mysql.gtid_slave_pos. I was able to repeat this by running test case
rpl.rpl_parallel_optimistic with mysql.gtid_slave_pos being MyISAM.

What happens is that optimistic/aggresive mode runs conflicting transactions
in parallel and then sometimes needs to roll back. If the rollback occurs
after updating a MyISAM mysql.gtid_slave_pos table, the GTID will be left in
the table, and we get duplicate key error when the transaction is retried.

Optimistic/aggressive mode is not really useful with non-transactional
tables, but it is not supposed to break replication. Normally the code falls
back to non-parallel replication of changes to non-transactional tables, but
this is checked on the master, so does not take into account
mysql.gtid_slave_pos.

But then, if mysql.gtid_slave_pos is non-transactional, nothing can run in
parallel under optimistic/aggressive mode, which is not very useful, so
probably giving an error is better. I have filed MDEV-10242 for this, thanks
for reporting the problem!

Usually, mysql.gtid_slave_pos is created as InnoDB. Only for users that have
MyISAM as default engine is it created as MyISAM (or if they explicitly
ALTER TABLE'd it). So that is probably why it has not hit a lot of users so
far. Your setup was using a data directory from mysql-test-run, which
explicitly changes default to MyISAM, hence you got the problem.

 - Kristian.


References