
maria-developers team mailing list archive

Re: Syntax for parallel replication

 

Sergei Golubchik <serg@xxxxxxxxxxx> writes:

> In the view of 4) above, did you consider using system variables? Like
>
>   --slave-parallel-mode={domain|groupcommit|transactional|waiting}
>
> and, the usual, --connection_name.slave-parallel-mode=... for
> multi-source. This variable can be of SET or FLAGSET type, so it could
> be set to a combination of values.

In fact, this is what I did first (use an enum system variable).
I didn't know that it was possible to use --connection_name.XXX to configure
things, which is why I thought I needed to use CHANGE MASTER instead.

I can try to figure out how --connection_name.slave-parallel-mode works and
change it back.

As I think more about this discussion, it seems clear that the
design of parallel replication in 10.0 is rather lacking: it is too complex and
has poor configurability.

On the one hand, I would really like it to work automatically. On the other
hand, there are complex issues, and fine-grained control seems to be needed
for power users and for testing.

Best would be if it was just enabled by default. But I am not sure that is
possible, mainly because old-style binlog position is not transactional
(relay-log.info file) and thus behaves differently if parallel is
enabled. Maybe in GTID mode, we could actually have parallel be the default.

But what about Jonas' suggestion of --slave-parallel-mode=auto?

Just one simple configuration option for users to enable. Once this is set,
the server will do its best to replicate in parallel as well as possible,
though using only methods that are safe no matter what the replication load is
(DDL, non-transactional statements, and so on).

(In practice, "auto" will mean the same as "transactional": InnoDB DML will be
run in parallel speculatively with prior transactions, and no other
parallelisation will be done. Maybe with some simple heuristics to turn off
parallel replication in case of many retries, if there is time to implement it.)

What do you think? Is this the way forward?

Following some more detailed comments:

>>  - When replicating with non-transactional updates, or in non-gtid
>>  mode, slave state is not crash safe. This is true in non-parallel
>>  replication also, but in parallel replication, the problem seems
>>  amplified, as there may be multiple transactions in progress at the
>>  time of a crash, complicating possible manual recovery. This also
>>  suggests that parallel replication must be configurable.
>
> Hm. From reading the MDEV, I've got an idea that you won't replicate
> non-transactional updates concurrently (as they cannot be rolled back,
> so your base assumption doesn't work). Was it wrong - will you replicate
> non-transactional updates concurrently?

There are multiple ways that two transactions T1 and T2 can be run in
parallel. Here is how it is in 10.0, when --slave-parallel-threads > 0:

1. If using GTID mode, and T1 and T2 have GTIDs with different domain_ids,
then T1 and T2 will be applied in parallel without any restrictions.

2. If T1 and T2 have the same group commit id in the master binlog, then they
can be applied in parallel, but their commit step is serialised to keep the
same commit order.

3. If T1 and T2 do not have the same group commit id, then the commit step of
T1 can run in parallel with T2, but no other part of T1.

All of these work the same for non-transactional and transactional event
groups in 10.0.

With MDEV-6676, I am proposing introducing a new speculative parallel
replication mode called "transactional". If enabled, it replaces (2) and (3)
above with:

4. If T2 is transactional, and T1 is not DDL, T2 is allowed to run in
parallel with T1, but its commit step is serialised to keep the same commit
order.

In transactional mode, a non-transactional T2 is _not_ applied in parallel
with a prior T1, unless point (1) with different domain ids applies. However,
non-transactional T2 can be applied in parallel with a following transactional
T3.
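
The rules (1) to (4) above can be sketched as a small Python predicate. This is
only an illustrative model, not server code: the names Mode, Txn and
how_parallel, the result strings, and the domain_parallel switch are all
hypothetical and chosen just for this sketch.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Mode(Enum):
    NONE = "none"                    # neither (2,3) nor (4)
    GROUPCOMMIT = "groupcommit"      # rules (2) and (3), as in 10.0
    TRANSACTIONAL = "transactional"  # rule (4), as proposed in MDEV-6676


@dataclass
class Txn:
    domain_id: int
    group_commit_id: Optional[int]   # group commit id from the master binlog
    transactional: bool = True
    ddl: bool = False


def how_parallel(t1: Txn, t2: Txn, mode: Mode,
                 gtid_mode: bool = True,
                 domain_parallel: bool = True) -> str:
    """Return how T2 (later in the binlog) may overlap with the earlier T1."""
    # Rule 1: different domains in GTID mode, no restrictions at all.
    if gtid_mode and domain_parallel and t1.domain_id != t2.domain_id:
        return "full"
    if mode is Mode.GROUPCOMMIT:
        # Rule 2: same group commit on the master, commit order is kept.
        if (t1.group_commit_id is not None
                and t1.group_commit_id == t2.group_commit_id):
            return "ordered-commit"
        # Rule 3: only the commit step of T1 may overlap with T2.
        return "commit-step-only"
    if mode is Mode.TRANSACTIONAL:
        # Rule 4: speculative, only for a transactional T2 after a non-DDL T1.
        if t2.transactional and not t1.ddl:
            return "ordered-commit"
        return "serial"
    return "serial"
```

For example, how_parallel(Txn(1, 10, transactional=False), Txn(1, 11),
Mode.TRANSACTIONAL) gives "ordered-commit", which is the case of a
non-transactional event group followed by a transactional one.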

Anyway, my point is mainly that parallel replication needs to be configurable,
including the possibility to turn it off. I don't think we disagree on that.

My current proposal is that (1) can be turned on and off; and independently,
either (2,3) or (4), or neither, can be enabled (but not both).

> 3. How to specify the degree of parallelization - the number of threads?
>    Still --slave-parallel-threads=N ? Your syntax doesn't seem to cover
>    that.

The degree of parallelism is controlled by three points:

--slave-parallel-threads=N specifies the number of threads that are used. This
is a static pool of threads, which cannot be changed unless all multi-source
slaves are stopped. The threads are shared among all multi-source slaves.

Parallel replication always tries to maximise parallelism up to the
--slave-parallel-threads=N limit. Every event from the binlog is queued for a
new thread, in round-robin fashion. But often, the actual parallelism will be
lower than N, because of the constraints on applying particular
transactions in parallel, as described above.

--slave-domain-parallel-threads optionally limits how many threads can be used
to replicate a single domain in a single multi-source slave. Without this, it
would be possible for one slow transaction T1 to starve other multi-source
slaves: we might end up queueing T1, T2, T3, ... TN on all the available
threads; all of them will then wait for T1 to complete, and until then no
threads are available to other multi-source slaves.
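
The starvation scenario can be illustrated with a toy model in Python. It
assumes the worst case described above, where every queued transaction keeps
holding its worker thread because it is waiting for T1. The function name
threads_held and the (connection name, domain_id) keys are hypothetical; this
is not how the server's scheduler is implemented.

```python
from typing import Dict, List, Optional, Tuple

Key = Tuple[str, int]  # (connection name, domain_id), hypothetical labels


def threads_held(queue: List[Key], n_threads: int,
                 domain_limit: Optional[int] = None) -> Dict[Key, int]:
    """Count worker threads held per (connection, domain) in the worst case.

    Worst case: nothing completes, so a thread is never given back once a
    transaction has been queued on it.
    """
    cap = n_threads if domain_limit is None else min(domain_limit, n_threads)
    held: Dict[Key, int] = {}
    free = n_threads
    for key in queue:
        held.setdefault(key, 0)
        # A transaction gets a thread only if the pool has one left and its
        # domain is still below the per-domain limit.
        if held[key] < cap and free > 0:
            held[key] += 1
            free -= 1
    return held


# Six transactions from connection "a", then two from connection "b".
queue = [("a", 0)] * 6 + [("b", 0)] * 2
no_limit = threads_held(queue, n_threads=4)
with_limit = threads_held(queue, n_threads=4, domain_limit=2)
```

Without a limit, connection "a" ends up holding all 4 threads and "b" gets
none; with a per-domain limit of 2, each connection gets 2 threads.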

The static thread pool is somewhat primitive, but it is what we have now.
Probably --slave-domain-parallel-threads would be better if it was per
multi-source slave, --connection_name.slave-domain-parallel-threads.

> 1. I'd rename "groupcommit" to something less technical, like "master",
>    or "following_master", or "following", (or whatever)

I agree that "groupcommit" is rather too technical. However,
"following_master" is too generic, it doesn't say anything.

The point is, we will run transactions in parallel on the slave if they
_committed_ in parallel on the master. This is already a rather technical
issue. The user needs to be aware that it is related to commit, as tuning
options like --binlog-commit-wait-count may be needed on the master.

I guess this technical nature of the "groupcommit" feature is part of the
motivation for something better with speculative replication.

In MySQL 5.7, they call the corresponding feature:

    --slave-parallel-type=LOGICAL_CLOCK

However, we do not use the term LOGICAL_CLOCK in MariaDB, and it's hardly any
less technical, so it doesn't seem a good solution.

BTW, I wonder if we should use the same option name? But I suppose not, it's
better to have a different name, so users will be forced to change the config,
rather than silently pick up a mysql config option with the same name but
possibly different semantics.

How about calling the option "binlog_commit" instead, to match the related
--binlog-commit-wait-* options on the master?

> As you like.
> I'd simply say that in domain-based parallel replication, the user is
> responsible for domain independence. If he has misconfigured domains,
> the server is not at fault, and we should not bother covering this use
> case.

Yes, this is how it is currently.

However, it just seemed to me that it would be useful if domain-based
parallel replication could be turned on or off, independently of the other
modes of parallel replication. There might be use cases where different
domains were used, without the intention that they could be replicated
out-of-order.

 - Kristian.

