maria-developers team mailing list archive

Re: [Maria-discuss] Known limitation with TokuDB in Read Free Replication & parallel replication ?

 

Rich Prohaska <prohaska7@xxxxxxxxx> writes:

> The group lock retry algorithm is on the
> https://github.com/prohaska7/tokuft/tree/killwait branch.  Its unit tests
> pass.  Needed to add some test only functions to get reproducible behaviour.
>
> The group lock retry algorithm is integrated into my mariadb server on the
> https://github.com/prohaska7/mariadb-server/tree/toku_opr3  branch.  Ran
> sysbench oltp on a small 1000 row table successfully.

Looks great, thanks! It passes tests for me, as well.

> I am going to write up the tokudb lock tree races that were fixed and email
> to George Lorch @ Percona so that this code can be integrated into
> PerconaFT.

Ok, sounds great!

Then I will push the replication part of the patch (the async deadlock kill)
to MariaDB 10.1.

From the git history, it looks like new TokuDB releases (from Percona
Server) are regularly merged into MariaDB 10.1, so I'm thinking that we can
get your TokuDB/tokuft changes into MariaDB that way, in the next regular
TokuDB merges. I will check it and add any missing MariaDB stuff, if it is
not part of the changes that go upstream.

Does that sound ok to you?

> Removed the lock wait for report from the lock request start method since
> it is redundant with the report that will occur when the lock request is
> retried in the lock request wait method.

The reason I added this reporting originally was for the case where a
deadlock is detected.

If transaction T1 tries to get a lock with lock_request::start(), but a
deadlock is detected (DB_LOCK_DEADLOCK is returned), the lock request wait
method will not be called (if I understand the code correctly), so the
reporting in lock_request::start() was not redundant.

The rationale is that if T1 gets aborted due to a deadlock with T2, and T2
is later in the replication commit order, then when T1 is run again by
replication, it will almost certainly conflict with T2 again. So we might as
well get T2 killed early (by doing the report already in start()).
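
To make the rationale above concrete, here is a minimal, self-contained
sketch. Everything in it is a hypothetical stand-in (Txn, LockResult,
start_lock_request(), report_conflict(), the report_in_start flag); it is not
the actual TokuFT lock tree API or the MariaDB kill callback, and it only
models the control flow: on the deadlock path the wait step is never
reached, so a report in start() is the only chance to get the later
transaction killed early.

```cpp
#include <cstdint>
#include <functional>

// Hypothetical stand-ins for illustration only; not the real TokuFT types.
enum LockResult { LOCK_GRANTED, LOCK_WAIT, LOCK_DEADLOCK };

struct Txn {
    uint64_t id;
    uint64_t commit_seq;  // position in the replication commit order
    bool killed;          // set when replication decides to kill the txn
};

// Callback invoked as report(waiter, holder) when a conflict is seen.
using ConflictReport = std::function<void(Txn&, Txn&)>;

// Assumed replication-side policy: kill the holder only if it is later in
// commit order than the blocked transaction.
void report_conflict(Txn& waiter, Txn& holder) {
    if (holder.commit_seq > waiter.commit_seq) {
        holder.killed = true;
    }
}

// Sketch of the start step of a lock request.
// holder == nullptr means nobody holds a conflicting lock.
LockResult start_lock_request(Txn& requester, Txn* holder,
                              bool would_deadlock, bool report_in_start,
                              const ConflictReport& report) {
    if (holder == nullptr) {
        return LOCK_GRANTED;
    }
    if (would_deadlock) {
        // The wait step is never reached on this path, so without a report
        // here the holder is not killed early.
        if (report_in_start) {
            report(requester, *holder);
        }
        return LOCK_DEADLOCK;
    }
    // Ordinary conflict: the report happens later, when the request is
    // retried in the wait step (not modelled here).
    return LOCK_WAIT;
}
```

With report_in_start set, a deadlock between T1 (earlier) and T2 (later)
marks T2 as killed right away; without it, T1 still gets the deadlock error
but T2 is left alone until the conflict shows up again on retry.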

But on the other hand, things will work correctly without any reporting in
start(), and with only a slight delay in case of a conflict. And the
assumption in optimistic parallel replication is that conflicts will be
relatively rare. So I'm fine without reporting in start(), as you have in
your current code.


Looks like we are close now to having optimistic parallel replication
working with TokuDB. Thanks for all your work on this, Rich!

 - Kristian.

