
maria-developers team mailing list archive

Re: Re: Re: Re: MDEV-520: consider parallel replication patch from taobao patches

 

丁奇 <dingqi.lxb@xxxxxxxxxx> writes:

>    These cases failed because the slave is waiting for the error number 1062, which is ignored in Parallel mode.
> 4) rpl.rpl_row_basic_3innodb
>    rpl_row_basic_2myisam


> Finally, there are some test-cases that are too complex and need your help.

>    rpl.rpl_circular_for_4_hosts

Hm, I looked at this. For me, it fails like this:

CURRENT_TEST: rpl.rpl_circular_for_4_hosts
mysqltest: In included file "./include/wait_for_slave_param.inc": 
included from ./include/wait_for_slave_sql_error.inc at line 41:
included from /home/knielsen/my/10.0/work-10.0-mdev520/mysql-test/suite/rpl/t/rpl_circular_for_4_hosts.test at line 92:
At line 115: Timeout in include/wait_for_slave_param.inc

The test sets up circular replication S1->S2->S3->S4->S1. Then it injects a
duplicate primary key error, and waits for server S3 to stop on that error
(1062 / ER_DUP_ENTRY).

But since with your patch that error is ignored, the wait times out. So this
seems to be the same as the other tests you mentioned in (4). (But let me know
if I missed something?).

----

I next want to try one thing that I may have mentioned before: Implement an
option so that threads wait with COMMIT until the previous event group has
committed. I will try to implement this and let you know how it goes.
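
To make the idea concrete, here is a minimal sketch in Python rather than server code. It assumes each event group gets a sequence number when it is queued; the names CommitOrder and run_workers are made up for illustration and are not from the patch.

```python
import threading


class CommitOrder:
    """Lets workers commit only in the order their event groups were queued."""

    def __init__(self):
        self._cond = threading.Condition()
        self._next = 0       # sequence number that is allowed to commit next
        self.committed = []  # commit order actually observed

    def commit(self, seq):
        with self._cond:
            # Wait until every earlier event group has committed.
            self._cond.wait_for(lambda: self._next == seq)
            self.committed.append(seq)
            self._next += 1
            self._cond.notify_all()


def run_workers(n):
    """Run n hypothetical worker threads and return the observed commit order."""
    order = CommitOrder()

    def worker(seq):
        # The data changes would run here, in parallel and in any order.
        order.commit(seq)

    # Start workers in reverse so they tend to reach commit out of order.
    threads = [threading.Thread(target=worker, args=(seq,))
               for seq in reversed(range(n))]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return order.committed
```

Whatever order the workers reach the commit step in, run_workers(8) returns the sequence numbers in queue order, which is the property the option would enforce.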

One nice thing about such an option is that then parallel replication is
completely transparent to the application. Transactions can never be observed
in a different order on a slave than on the master. So it could perhaps even
be enabled by default. This is something I value a lot: improvements that
benefit users in the default configuration, without DBAs having to learn
about new features (and their possible limitations) and explicitly enable
them.

Of course, such an option should also help remove a lot of the test failures
you mentioned.

I say "an option", as I imagine we would in any case keep the possibility of
the current behaviour in your patch, where commits can happen out-of-order,
since enforcing commit order could decrease performance in some cases. And we
surely would want to compare the performance between the two.

I actually think in many cases performance will not suffer by enforcing
commits in-order. The main time spent in a transaction in InnoDB is in the
data changes, and these can run in parallel as before. The commit step just
marks the transaction as committed and writes the commit record to the redo
log. The only slow part of commit is the fsync() of binlog and redo log to
disk. However, MariaDB has group commit, so if transaction B waits for
transaction A with COMMIT, then we could just group commit A and B together
with a single fsync(). This may even improve performance compared to
committing them separately.
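
As a toy model of that argument (not MariaDB's actual group commit code), the sketch below just counts simulated fsync() calls. The class GroupCommitLog and both helper functions are hypothetical names used only for illustration.

```python
class GroupCommitLog:
    """Toy log that counts how many simulated fsync() calls it needs."""

    def __init__(self):
        self.pending = []     # transactions queued for commit, in order
        self.durable = []     # transactions made durable, in order
        self.fsync_calls = 0

    def queue(self, txn):
        self.pending.append(txn)

    def flush(self):
        """Make all queued transactions durable with one simulated fsync()."""
        if not self.pending:
            return
        self.durable.extend(self.pending)
        self.pending.clear()
        self.fsync_calls += 1


def commit_separately(txns):
    """Each transaction pays for its own fsync()."""
    log = GroupCommitLog()
    for txn in txns:
        log.queue(txn)
        log.flush()
    return log


def commit_grouped(txns):
    """Later transactions wait in the queue and share one fsync()."""
    log = GroupCommitLog()
    for txn in txns:
        log.queue(txn)
    log.flush()
    return log
```

With transactions A and B, commit_separately needs two fsync() calls and commit_grouped needs one, while both make A durable before B.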

In fact, as I'm writing this, I get the idea to do this waiting inside the
group commit. This could be quite elegant, let me try this.

Of course, there are some cases where enforcing commit order will reduce
performance:

 - If we have a large transaction, like bulk load of lots of data, then other
   transactions will be delayed in their commit. This means that the application
   will see a larger replication delay for those transactions.

 - If more threads are waiting for previous transactions to COMMIT, more
   threads will need to be configured to keep the same amount of parallelism.

 - If a transaction is waiting instead of committing, then it can cause more
   conflicts with later transactions, reducing parallelism.
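
The first point can be shown with a small back-of-the-envelope model. It assumes there are enough worker threads, that the commit step itself costs nothing, and that all we know is when each transaction finishes its data changes; commit_times is a made-up helper, not anything in the server.

```python
from itertools import accumulate


def commit_times(finish_times, in_order):
    """Return when each transaction becomes visible on the slave.

    finish_times[i] is the time at which transaction i (in master commit
    order) finishes its data changes.
    """
    if not in_order:
        # Out-of-order commit: each transaction commits as soon as it is done.
        return list(finish_times)
    # In-order commit: a transaction cannot commit before any earlier one,
    # so its commit time is the running maximum of the finish times.
    return list(accumulate(finish_times, max))
```

For a bulk load that finishes at time 100 followed by two small transactions that finish at times 2 and 3, commit_times([100, 2, 3], in_order=True) gives [100, 100, 100], while in_order=False gives [100, 2, 3].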

Still, with mostly small transactions that mostly do not conflict, I think
ordered commits will not reduce performance much. It would be interesting to
try, at least.

 - Kristian.

