maria-developers team mailing list archive
Message #07771
Re: Syntax for parallel replication
Hi, Kristian!
On Oct 06, Kristian Nielsen wrote:
> - Parallel replication is still a somewhat experimental feature, so
> it seems too risky to enable it by default. Also, it doesn't really
> seem possible for the server to automatically set the best number of
> threads to use, with current implementation (or possibly any
> implementation).
How about increasing parallelization while replication proceeds without
retries, and penalizing it when retries happen? With an upper limit
similar to (or derived from) innodb-concurrency-tickets. Just a thought.
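To make the idea a bit more concrete, here is a rough sketch of such a
feedback loop. It is only an illustration: the class name, the
"add one / halve" policy and the per-batch retry count are all assumptions
made for the example, not anything that exists in the server.

```python
class ParallelismTuner:
    """Hypothetical feedback controller for the number of worker threads.

    Policy (an assumption for this sketch): add one thread after a batch
    with no retries, halve the thread count after a batch with retries.
    """

    def __init__(self, max_threads, min_threads=1):
        if min_threads < 1 or max_threads < min_threads:
            raise ValueError("need 1 <= min_threads <= max_threads")
        self.min_threads = min_threads
        self.max_threads = max_threads  # the upper limit mentioned above
        self.threads = min_threads      # start conservatively

    def record_batch(self, retries):
        """Adjust the thread count after one batch of applied transactions."""
        if retries > 0:
            # Retries happened: penalize by halving, but never go below the floor.
            self.threads = max(self.min_threads, self.threads // 2)
        else:
            # Replication "just worked": cautiously add one more thread.
            self.threads = min(self.max_threads, self.threads + 1)
        return self.threads
```

With a floor of one thread, a workload that keeps conflicting would drift
back to serial apply on its own.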
> - When replicating with non-transactional updates, or in non-gtid
> mode, slave state is not crash safe. This is true in non-parallel
> replication also, but in parallel replication, the problem seems
> amplified, as there may be multiple transactions in progress at the
> time of a crash, complicating possible manual recovery. This also
> suggests that parallel replication must be configurable.
Hm. From reading the MDEV, I got the impression that you won't replicate
non-transactional updates concurrently (as they cannot be rolled back,
so your base assumption doesn't hold). Was that wrong? Will you replicate
non-transactional updates concurrently?
> - When using domain-based parallel replication, the user is
> responsible for ensuring that independent domains are non-conflicting
> and can be replicated out-of-order wrt. each other. So if replication
> domains are used, but this property is not guaranteed, then
> domain-based parallel replication needs to be configurable, or
> parallel replication cannot be used at all.
As you like.
I'd simply say that in domain-based parallel replication, the user is
responsible for domain independence. If he has misconfigured domains,
the server is not at fault, and we should not bother covering this use
case.
> - The new speculative replication feature in MDEV-6676 is not always
> guaranteed to be a win - in some workloads, where there are many
> conflicts between successive transactions, excessive rollback could
> cause it to be less efficient than not using it. Again, this suggests
> it needs to be configurable.
Agree.
Though if the concurrency is auto-tuned as I mentioned above, it'll
disable itself in this case, with no user intervention.
> So given this, I came up with the following idea for syntax:
>
> CHANGE MASTER TO PARALLEL_MODE=(domain,groupcommit,transactional,waiting)
>
> Each of the four keywords in the parentheses is optional.
>
> "domain" enables domain-based parallelisation, where each replication domain
> is treated independently.
>
> "groupcommit" enables the non-speculative mode, where only transactions that
> group-committed together on the master are applied in parallel on the slave.
>
> "transactional" enables the speculative mode, where all transactional DML is
> optimistically tried in parallel, and then in case of conflict a rollback and
> retry is done.
>
> "groupcommit" and "transactional" are mutually exclusive, at most one of them
> can be specified.
Assorted thoughts in no specific order:
1. I'd rename "groupcommit" to something less technical, like "master",
or "following_master", or "following" (or whatever).
2. How does it work with multi-source? The usual "CHANGE MASTER name TO"?
3. How to specify the degree of parallelization - the number of threads?
Still --slave-parallel-threads=N? Your syntax doesn't seem to cover
that.
4. Command line? None? CHANGE MASTER specifies replication coordinates,
and they change on every restart, that's why there's no command-line
option for them. They're stored in master-info.
But your "TO PARALLEL_MODE" only configures how events are applied,
which seems like something that should rather be in my.cnf.
In view of 4) above, did you consider using system variables? Like
--slave-parallel-mode={domain|groupcommit|transactional|waiting}
and the usual --connection_name.slave-parallel-mode=... for
multi-source. This variable can be of SET or FLAGSET type, so it could
be set to a combination of values.
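As an illustration only, this is roughly how a SET-style value could be
checked, using the four keywords and the mutual-exclusion rule from your
proposal. The function name and the comma-separated input format are
assumptions made for the sketch; it does not reflect how the server
parses SET or FLAGSET variables.

```python
VALID_MODES = frozenset({"domain", "groupcommit", "transactional", "waiting"})
# Per the proposal, at most one of these two may be given.
EXCLUSIVE = frozenset({"groupcommit", "transactional"})


def parse_parallel_mode(value):
    """Parse a comma-separated mode list (hypothetical format) into a set."""
    modes = {m.strip().lower() for m in value.split(",") if m.strip()}
    unknown = modes - VALID_MODES
    if unknown:
        raise ValueError("unknown mode(s): " + ", ".join(sorted(unknown)))
    if EXCLUSIVE <= modes:
        raise ValueError("groupcommit and transactional are mutually exclusive")
    return frozenset(modes)
```

For example, "domain,transactional" would be accepted, while
"groupcommit,transactional" would be rejected. An empty value gives the
empty set, i.e. no parallel mode enabled.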
Regards,
Sergei