maria-discuss team mailing list archive

Re: Known limitation with TokuDB in Read Free Replication & parallel replication ?

Rich Prohaska <prohaska7@xxxxxxxxx> writes:

> Is TokuDB supposed to call the thd report wait for API just prior to a
> thread about to wait on a tokudb lock?

Yes, that's basically it.

Optimistic parallel replication runs transactions in parallel, but enforces
that they commit in the original order. So suppose we have transactions T1
followed by T2 in the replication stream, and that they try to update the
same row. When T2 gets ready to commit, it needs to wait for T1 to commit
first (this is what you see in wait_for_prior_commit()). However, if T1 is
waiting on a row lock held by T2, we have a deadlock.

thd_report_wait_for() checks for this condition. If a transaction goes to
wait on a lock held by a later (in terms of in-order replication)
transaction, the later transaction is killed (using the normal thread kill
mechanism). Parallel replication then gracefully handles the kill (by
rollback and retry).
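
As a rough illustration of that check, here is a toy Python model. It is
not the server code: the Txn class, its commit_seq field, and this
report_wait_for() function are hypothetical stand-ins for the THD, the
replication commit order, and thd_report_wait_for().

```python
from dataclasses import dataclass


@dataclass
class Txn:
    """Hypothetical stand-in for a replicated transaction (not the real THD)."""
    name: str
    commit_seq: int        # position in the original (master) commit order
    killed: bool = False   # set when the transaction is chosen as the victim


def report_wait_for(waiter: Txn, holder: Txn) -> bool:
    """Toy analogue of the check: called when `waiter` blocks on `holder`.

    If the lock holder must commit after the waiter, the holder would end up
    waiting for the waiter at commit time, so the two would deadlock. The
    holder is marked as killed; the replication layer would then roll it
    back and retry it. Returns True if the holder was marked.
    """
    if holder.commit_seq > waiter.commit_seq:
        holder.killed = True
        return True
    return False


t1 = Txn("T1", commit_seq=1)
t2 = Txn("T2", commit_seq=2)

# T1 blocks on a row lock held by T2, which has to commit after T1.
report_wait_for(t1, t2)
print(t2.killed)  # True
```

A wait in the other direction (T2 waiting on a lock held by T1) is an
ordinary lock wait and leaves both transactions alone in this model.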

You can see how this is done for InnoDB/XtraDB in
storage/xtradb/lock/lock0lock.cc, e.g. lock_report_waiters_to_mysql().

Hopefully it would be easy to hook this into TokuDB. It does require being
able to locate the transaction (and in particular the THD) that owns a given
lock. Another potential issue (at least it was for InnoDB/XtraDB) is that
thd_report_wait_for() can call back into the handlerton->kill_query method,
so the caller of thd_report_wait_for() needs to be prepared for this to
happen.
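
One possible way to be prepared for that callback, sketched as a toy Python
model: collect the lock holders while holding the engine's lock mutex, and
only report them after the mutex has been released. Everything here
(LockTable, acquire_or_report(), the report callback) is hypothetical and
only illustrates the ordering; it is not TokuDB or InnoDB code.

```python
import threading


class LockTable:
    """Toy row-lock table guarded by a single non-reentrant mutex."""

    def __init__(self, report_wait_for):
        self._mutex = threading.Lock()
        self._holders = {}              # row key -> set of transaction ids
        self._report = report_wait_for  # stand-in for thd_report_wait_for()

    def acquire_or_report(self, txn, row):
        """Grant the lock if the row is free, else report who we wait for."""
        with self._mutex:
            holders = self._holders.setdefault(row, set())
            if not holders or holders == {txn}:
                holders.add(txn)
                return True
            # Only collect the holders here; do not call out yet.
            pending = [h for h in holders if h != txn]
        # The mutex is released at this point, so the report callback may
        # safely call back into kill_query() below.
        for holder in pending:
            self._report(txn, holder)
        return False

    def kill_query(self, txn):
        """Stand-in for the engine's kill hook: drop the victim's locks."""
        with self._mutex:
            for holders in self._holders.values():
                holders.discard(txn)


killed = []


def report(waiter, holder):
    # In this toy model every reported holder is killed immediately.
    killed.append(holder)
    table.kill_query(holder)


table = LockTable(report)
table.acquire_or_report("T2", "row1")          # T2 takes the lock
table.acquire_or_report("T1", "row1")          # T1 must wait; T2 is reported
print(killed)                                  # ['T2']
print(table.acquire_or_report("T1", "row1"))   # True
```

Had the report been made while still holding the mutex, kill_query() would
have blocked on the same mutex in this model.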

Note that we can modify/extend the thd_report_wait_for() API to work better
for TokuDB, if necessary. The current API was deliberately left "internal"
(not a service with a public header file etc.) in anticipation that it might
need changing to better support other storage engines, such as TokuDB.

Also note that the call to thd_report_wait_for() does not need to happen
"just prior" to the lock wait - it can happen later, as long as it happens
at some point (though of course the earlier the better, in terms of more
quickly resolving the deadlock and allowing replication to proceed).

> I have been running sysbench oltp with a mariadb 10.1 master-slave
> topology.  I have not seen any replication errors when slave parallel mode
> is conservative.

No, it should not happen, because in conservative mode transactions are not
run in parallel on a slave unless they ran without lock conflicts on the
master (both transactions reached the commit point at the same time).
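
A very simplified Python model of that rule, assuming each replicated
transaction carries a hypothetical id for the master group commit it was
part of. The real scheduler is more involved; this only shows which
transactions are candidates to run in parallel with each other.

```python
from itertools import groupby

# (transaction name, hypothetical id of the master group commit it was in)
stream = [("T1", 100), ("T2", 100), ("T3", 101), ("T4", 102), ("T5", 102)]


def parallel_batches(events):
    """Group consecutive transactions that committed together on the master."""
    return [
        [name for name, _ in group]
        for _, group in groupby(events, key=lambda event: event[1])
    ]


batches = parallel_batches(stream)
print(batches)  # [['T1', 'T2'], ['T3'], ['T4', 'T5']]
```

Transactions in the same batch reached the commit point together on the
master, so they did not conflict there.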

But in InnoDB/XtraDB, there are some interesting (but very rare) corner
cases where two transactions may or may not have lock conflicts depending on
the exact order of execution. So for these cases, the thd_report_wait_for()
mechanism is also needed.

 - Kristian.
