maria-developers team mailing list archive

Re: Syntax for parallel replication

To: Kristian Nielsen <knielsen@xxxxxxxxxxxxxxx>
From: Sergei Golubchik <serg@xxxxxxxxxxx>
Date: Mon, 13 Oct 2014 15:01:55 +0200
Cc: MariaDB Developers <maria-developers@xxxxxxxxxxxxxxxxxxx>
In-reply-to: <871tqlh3ym.fsf@frigg.knielsen-hq.org>
User-agent: Mutt/1.5.23 (2014-03-12)

Hi, Kristian!

On Oct 06, Kristian Nielsen wrote:
>  - Parallel replication is still a somewhat experimental feature, so
>  it seems too risky to enable it by default. Also, it doesn't really
>  seem possible for the server to automatically set the best number of
>  threads to use, with current implementation (or possibly any
>  implementation).

Increase parallelization when replication just works, and penalize it
when retries happen? With an upper limit similar to (or derived from)
innodb-concurrency-tickets. Just a thought.
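
A minimal sketch of what such auto-tuning could look like. Everything here
is an assumption for illustration: the class and method names are
hypothetical, and the additive-increase / halve-on-retry policy is just one
possible way to "increase" and "penalize"; nothing like this exists in the
server.

```python
class ParallelismTuner:
    """Hypothetical controller: grow the worker count while replication
    runs cleanly, shrink it when retries happen."""

    def __init__(self, max_threads, start=1):
        if max_threads < 1:
            raise ValueError("max_threads must be at least 1")
        # Upper limit; how it would be derived is left open (assumption).
        self.max_threads = max_threads
        self.threads = min(max(start, 1), max_threads)

    def on_batch(self, retries):
        """Report the number of retries seen in the last batch of
        transactions and return the new worker count."""
        if retries == 0:
            # Replication "just works": add one worker, up to the limit.
            self.threads = min(self.threads + 1, self.max_threads)
        else:
            # Retries happened: penalize by halving, never below 1.
            self.threads = max(self.threads // 2, 1)
        return self.threads


tuner = ParallelismTuner(max_threads=8)
for retries in (0, 0, 0, 2, 0):
    tuner.on_batch(retries)
```

With one worker left, events are applied serially, so a workload with
constant conflicts would settle there on its own.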

>  - When replicating with non-transactional updates, or in non-gtid
>  mode, slave state is not crash safe. This is true in non-parallel
>  replication also, but in parallel replication, the problem seems
>  amplified, as there may be multiple transactions in progress at the
>  time of a crash, complicating possible manual recovery. This also
>  suggests that parallel replication must be configurable.

Hm. From reading the MDEV, I got the impression that you won't replicate
non-transactional updates concurrently (as they cannot be rolled back,
so your base assumption doesn't hold). Was that wrong - will you replicate
non-transactional updates concurrently?

>  - When using domain-based parallel replication, the user is
>  responsible for ensuring that independent domains are non-conflicting
>  and can be replicated out-of-order wrt. each other. So if replication
>  domains are used, but this property is not guaranteed, then
>  domain-based parallel replication needs to be configurable, or
>  parallel replication cannot be used at all.

As you like.
I'd simply say that in domain-based parallel replication, the user is
responsible for domain independence. If he has misconfigured domains,
the server is not at fault, and we should not bother covering this use
case.

>  - The new speculative replication feature in MDEV-6676 is not always
>  guaranteed to be a win - in some workloads, where there are many
>  conflicts between successive transactions, excessive rollback could
>  cause it to be less efficient than not using it. Again, this suggests
>  it needs to be configurable.

Agree.
Though if the concurrency is auto-tuned as I suggested above, it will
disable itself in this case, with no user intervention.

> So given this, I came up with the following idea for syntax:
> 
>   CHANGE MASTER TO PARALLEL_MODE=(domain,groupcommit,transactional,waiting)
> 
> Each of the four keywords in the parenthesis is optional.
> 
> "domain" enables domain-based parallelisation, where each replication domain
> is treated independently.
> 
> "groupcommit" enables the non-speculative mode, where only transactions that
> group-committed together on the master are applied in parallel on the slave.
> 
> "transactional" enables the speculative mode, where all transactional DML is
> optimistically tried in parallel, and then in case of conflict a rollback and
> retry is done.
> 
> "groupcommit" and "transactional" are mutually exclusive, at most one of them
> can be specified.

Assorted thoughts in no specific order:

1. I'd rename "groupcommit" to something less technical, like "master",
   or "following_master", or "following", (or whatever)

2. How does it work with multi-source? The usual "CHANGE MASTER name TO" ?

3. How to specify the degree of parallelization - the number of threads?
   Still --slave-parallel-threads=N ? Your syntax doesn't seem to cover
   that.

4. Command line? None? CHANGE MASTER specifies replication coordinates,
   and they change on every restart, that's why there's no command-line
   option for them. They're stored in master-info.

   But your "TO PARALLEL_MODE" only configures how to apply events,
   which seems like something that belongs in my.cnf instead.

In view of 4) above, did you consider using system variables? Like

  --slave-parallel-mode={domain|groupcommit|transactional|waiting}

and, the usual, --connection_name.slave-parallel-mode=... for
multi-source. This variable can be of SET or FLAGSET type, so it could
be set to a combination of values.
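
To illustrate how a combination of values might be validated, here is a
rough sketch. It is not server code: the function name
parse_parallel_mode is hypothetical, and the comma-separated input format
is an assumption. It only models the rules from the proposal, i.e. every
keyword is optional and "groupcommit" and "transactional" cannot be
combined.

```python
ALLOWED = ("domain", "groupcommit", "transactional", "waiting")


def parse_parallel_mode(value):
    """Parse a comma-separated list of mode keywords into a frozenset.

    Hypothetical helper; an empty string means no mode is enabled."""
    flags = {f.strip().lower() for f in value.split(",") if f.strip()}

    unknown = flags - set(ALLOWED)
    if unknown:
        raise ValueError("unknown mode(s): " + ", ".join(sorted(unknown)))

    # The proposal allows at most one of these two keywords.
    if {"groupcommit", "transactional"} <= flags:
        raise ValueError("groupcommit and transactional are mutually exclusive")

    return frozenset(flags)


modes = parse_parallel_mode("domain,groupcommit")
```

A SET-typed variable would accept any subset of the keywords, so the
mutual-exclusion rule would need an explicit check like the one above.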

Regards,
Sergei

Follow ups

Re: Syntax for parallel replication
From: Kristian Nielsen, 2014-10-16

References

Syntax for parallel replication
From: Kristian Nielsen, 2014-10-06