
maria-developers team mailing list archive

Re: More suggestions for changing option names for optimistic parallel replication

 

On Thu, Dec 4, 2014 at 5:49 AM, Kristian Nielsen
<knielsen@xxxxxxxxxxxxxxx> wrote:
> I discussed with Monty, and we came up with some more suggested changes for
> the options used to configure the optimistic parallel replication feature
> (MDEV-6676, https://mariadb.atlassian.net/browse/MDEV-6676).
>
> The biggest change is to split up the --slave-parallel-mode into multiple
> options. I think that is reasonable, probably that option was doing too many
> things at once.
>
> Instead, we could have the following three options:
>
> --slave-parallel-mode=all_transactions | follow_master_commits |
>                       only_commits | none
>
>     "all_transactions" is what was called "transactional" before. The slave
>     will try to apply all transactional DML in parallel; in case of conflicts
>     it will roll back the later transaction and retry it.
>
>     "follow_master_commits" is the 10.0 functionality, apply in parallel
>     transactions that group-committed together on the master (the default).
>
>     "only_commits" was suggested to me by a user testing parallel
>     replication. It does not attempt to apply transactions in parallel, but
>     still runs the commit steps in parallel, making slave group commit
>     possible and thus saving on fsyncs if durability settings are on.

Does this mean that group commit becomes possible when the slave is able
to execute several transactions consecutively while a previous
transaction is still committing/fsyncing?
I'd suggest naming this value differently, because from the list of
available values alone it is not clear what the difference between
follow_master_commits and only_commits is. I don't know yet what the
best name would be. Maybe overlap_commits?
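
To make the question concrete, here is a toy model in Python of what I
imagine happens. It is only a sketch and not based on the server code:
the function name count_fsyncs, the time units, and the rule that every
commit arriving while an fsync is in flight joins the next fsync are all
my assumptions.

```python
def count_fsyncs(commit_arrivals, fsync_duration):
    """Toy model (assumption, not MariaDB code): count the fsyncs needed
    when commits that are already waiting share a single fsync."""
    arrivals = sorted(commit_arrivals)
    fsyncs = 0
    log_free_at = 0  # time at which the previous fsync has finished
    i = 0
    while i < len(arrivals):
        # The next fsync starts once the log is free and a commit is waiting.
        start = max(log_free_at, arrivals[i])
        # Every commit that has arrived by then joins this group.
        while i < len(arrivals) and arrivals[i] <= start:
            i += 1
        fsyncs += 1
        log_free_at = start + fsync_duration
    return fsyncs


# Six transactions executed one after another, reaching commit at t=1..6.
slow_disk = count_fsyncs([1, 2, 3, 4, 5, 6], fsync_duration=10)
fast_disk = count_fsyncs([1, 2, 3, 4, 5, 6], fsync_duration=0)
print(slow_disk, fast_disk)
```

Under these assumptions the slow-fsync case needs 2 fsyncs instead of 6,
while nothing gets grouped when the fsync finishes before the next
transaction reaches its commit. Is that the effect only_commits is
meant to give?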

>     "none" means the parallel replication code is not used (same as
>     --slave-parallel-threads=0, but now configurable per multimaster
>     connection). (This corresponds to empty value in old
>     --slave-parallel-mode).
>
> --slave-parallel-domains=on|off     (default on)
>
>     This replaces the "domain" option of old --slave-parallel-mode. When
>     enabled, parallel replication will apply in parallel transactions whose
>     GTID has different domain ids (GTID mode only).

I don't understand what combining this flag with --slave-parallel-mode
would mean. Does it mean that when this flag is on, transactions from
different domains are executed at the "all_transactions" level of
parallelism no matter what value --slave-parallel-mode has? What will
happen if this flag is off but --slave-parallel-mode=all_transactions?

I feel like you are on to something here, but implementing it with
this flag is not quite right.
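
Here is one possible reading of the two options together, written as a
small Python sketch. Everything in it is my guess and not the proposed
implementation: the Txn class, its commit_group field and the function
may_apply_in_parallel are made-up names, and the rules are only how I
currently read the descriptions above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Txn:
    domain_id: int     # GTID domain id
    commit_group: int  # hypothetical id of the group commit on the master


def may_apply_in_parallel(mode, domains_on, a, b):
    """One guessed reading of --slave-parallel-mode combined with
    --slave-parallel-domains (assumption, not the real semantics)."""
    if domains_on and a.domain_id != b.domain_id:
        return True  # different domains: parallel regardless of mode
    if mode == "all_transactions":
        return True  # optimistic: try everything, retry on conflict
    if mode == "follow_master_commits":
        return a.domain_id == b.domain_id and a.commit_group == b.commit_group
    return False  # "only_commits" and "none" never apply in parallel


# Two transactions from different domains, under every combination.
a, b = Txn(domain_id=1, commit_group=100), Txn(domain_id=2, commit_group=200)
for mode in ("all_transactions", "follow_master_commits", "only_commits", "none"):
    for domains_on in (True, False):
        print(mode, domains_on, may_apply_in_parallel(mode, domains_on, a, b))
```

In this reading, turning the flag off changes nothing under
all_transactions, and turning it on makes even mode "none" apply
transactions in parallel. That is the part that confuses me.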

> --slave-parallel-wait-if-conflict-on-master=on|off  (default on)
>
>     When enabled, if a transaction had to do a row lock wait on the master, it
>     will not be applied in parallel with any earlier transaction on the slave
>     (idea is that such transaction is likely to get a conflict on the slave,
>     causing a needless retry). (This was the "waiting" option to old
>     --slave-parallel-mode).

Hm... The fact that a transaction did a lock wait on the master doesn't
mean that the conflicting transaction was committed on the master, or
that the two transactions were committed close enough together to even
be candidates for parallel execution on the slaves, right? Are you sure
that this flag will be useful?

> These options will also be usable per multi-source master connection, like
> --master1.slave-parallel-mode=all_transactions. The options will be possible
> to change dynamically also (with SET GLOBAL), though the associated slave
> threads must be stopped while changing.
>
> Also, Monty suggested to rename @@replicate_allow_parallel to
>
>     @@SESSION.replicate_expect_conflicts=0|1   (default 0)
>
> When this option is enabled on the master when a transaction is committed,
> that transaction will not be applied in parallel with earlier transactions
> (when --slave-parallel-mode=all_transactions). This can be used to reduce
> retries on the slave, if an application is about to do a transaction that is
> likely to cause a conflict and retry on a slave if applied in parallel with
> earlier transactions.

I think this variable will be completely useless and is not worth
implementing. How will a user know that the transaction he is about to
execute is likely to conflict with other transactions committed at
about the same time? I think it will be impossible to make that
judgement, and at the same time it puts too much control over the
slave's behavior into users' hands. Am I missing something? What kind
of scenario are you envisioning this variable being used in?

> Let me know if there are any comments to these or suggestions for changes. It
> is best to get these as right as possible before release (seems the intention
> is to include optimistic parallel replication in 10.1), since it is the
> user-visible part of the feature.
>
> With these option names, the normal way to use optimistic parallel replication
> would be these two options in my.cnf:
>
>    slave_parallel_mode=all_transactions
>    slave_parallel_threads=20  (or whatever)
>
> This seems reasonable, I think. None of the other options would need to be
> considered except in more special cases.


Hope that helps,
Pavel

