maria-developers team mailing list archive

Re: Syntax for parallel replication

 

Hi, Kristian!

On Oct 16, Kristian Nielsen wrote:
> Sergei Golubchik <serg@xxxxxxxxxxx> writes:
> 
> > In the view of 4) above, did you consider using system variables? Like
> >
> >   --slave-parallel-mode={domain|groupcommit|transactional|waiting}
> >
> > and, the usual, --connection_name.slave-parallel-mode=... for
> > multi-source. This variable can be of SET or FLAGSET type, so it could
> > be set to a combination of values.
> 
> In fact, this is what I did first (use an enum system variable).
> I didn't know that it was possible to use --connection_name.XXX to configure
> things, which is why I thought I needed to use CHANGE MASTER instead.

Well, "possible" can have different meanings.

I meant that, from the user's point of view, --connection_name.variable_name
is the normal and expected way to configure per-connection variables.
It is consistent with other per-connection variables and with named key
caches.

But it is quite possible that there's no existing class that implements
this functionality yet.
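
To illustrate the user-facing syntax only (not any server code), here is a
minimal sketch of how such an option could be split into a connection name,
a variable name, and a SET-style value. The helper name parse_option and the
comma-separated value format are assumptions made for this example; the mode
names are the ones proposed earlier in this thread.

```python
# Mode names as proposed in this thread; not a final list.
VALID_MODES = {"domain", "groupcommit", "transactional", "waiting"}


def parse_option(arg):
    """Split '--[connection_name.]variable-name=v1,v2' into its parts.

    Returns (connection_name, variable_name, frozenset_of_modes).
    An empty connection_name stands for the default connection.
    Hypothetical helper for illustration only.
    """
    if not arg.startswith("--") or "=" not in arg:
        raise ValueError("expected --name=value, got %r" % arg)
    key, _, raw = arg[2:].partition("=")
    # The optional connection name is everything before the last dot.
    conn, _, var = key.rpartition(".")
    var = var.replace("-", "_")
    # SET-style value: a comma-separated combination of modes.
    modes = frozenset(m for m in raw.split(",") if m)
    unknown = modes - VALID_MODES
    if unknown:
        raise ValueError("unknown mode(s): %s" % ", ".join(sorted(unknown)))
    return conn, var, modes
```

For example, parse_option("--master1.slave-parallel-mode=domain,groupcommit")
would give the connection "master1" and the two modes as a set, while the
same option without a prefix would apply to the default connection.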

> But what about jonas' suggestion of --slave-parallel-mode=auto?

Sounds fine.

> Anyway, my point is mainly that parallel replication needs to be configurable,
> including the possibility to turn it off. I don't think we disagree on that.

No, we don't. It should be configurable.

> > 1. I'd rename "groupcommit" to something less technical, like "master",
> >    or "following_master", or "following", (or whatever)
> 
> I agree that "groupcommit" is rather too technical. However,
> "following_master" is too generic, it doesn't say anything.

It means that the degree of parallelization on the slave follows
the degree of parallelization on the master. If more threads are
executed (committed, strictly speaking) in parallel on the master, more
threads can be committed in parallel on the slave. If the master is
strictly single-threaded, with only one connection doing changes,
the slave will follow that and will serialize all transactions too.
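
A rough sketch of that idea, under the assumption that each transaction in
the binlog carries some marker of the commit group it belonged to on the
master. The name commit_group and the function parallel_batches are
hypothetical, and this is only a model of the behaviour, not the actual
slave scheduler.

```python
from itertools import groupby


def parallel_batches(transactions):
    """Group transactions into batches the slave could apply in parallel.

    `transactions` is a list of (txn_id, commit_group) pairs in binlog
    order. Consecutive transactions that share a commit_group committed
    together on the master, so they form one batch; batches themselves
    are applied one after another.
    """
    return [
        [txn_id for txn_id, _ in group]
        for _, group in groupby(transactions, key=lambda t: t[1])
    ]
```

With a single-threaded master every transaction is its own commit group, so
every batch has exactly one transaction and the slave serializes everything.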

> The point is, we will run transactions in parallel on the slave if they
> _committed_ in parallel on the master. This is already a rather technical
> issue. The user needs to be aware that it is related to commit, as tuning
> options like --binlog-commit-wait-count may be needed on the master.

"following_master_commits" if you want to be really verbose, but I think
a shorter version is ok too.

> I guess this technical nature of the "groupcommit" feature is part of the
> motivation for something better with speculative replication.
> 
> In mysql 5.7, they call the corresponding feature:
> 
>     --slave-parallel-type=LOGICAL_CLOCK

This doesn't say anything either. Not until you read the manual, that
is. And if you do read the manual, then XYZ or DARK_VOODOO is almost
equally good.

> BTW, I wonder if we should use the same option name? But I suppose not, it's
> better to have a different name, so users will be forced to change the config,
> rather than silently pick up a mysql config option with the same name but
> possibly different semantics.

It is different, isn't it? You call it slave-parallel-mode.

> How about calling the option "binlog_commit" instead, to match the related
> --binlog-commit-wait-* options on the master?

But it's not only about commits. A user may want to disable all
parallelization, for example. Or to make sure that non-transactional
updates are not *run* in parallel with anything, not even with
uncommitted transactions.

I like your slave-parallel-mode name.

> > As you like.
> > I'd simply say that in domain-based parallel replication, the user is
> > responsible for domain independence. If he has misconfigured domains,
> > the server is not at fault, and we should not bother covering this use
> > case.
> 
> Yes, this is how it is currently.
> 
> However, it just seemed to me, that it would be useful if domain-based
> parallel replication could be turned on or off, independently of the other
> modes of parallel replication. There might be use cases where different
> domains were used, without the intention that they could be replicated
> out-of-order.

Sure.
As you're going to have an option for selecting slave parallel mode anyway...
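
As a sketch of what independent on/off switches could mean, assuming the
mode is a set of flags as discussed above: the function and the field names
"domain" and "commit_group" are hypothetical, and the rule is deliberately
simplified to just these two modes.

```python
def may_run_in_parallel(a, b, modes):
    """Decide whether transactions a and b may be applied in parallel.

    `a` and `b` are dicts with hypothetical keys "domain" and
    "commit_group"; `modes` is the set of enabled parallel modes.
    An empty set disables all parallelization.
    """
    if not modes:
        return False
    if a["domain"] != b["domain"]:
        # Different domains run in parallel only if the user enabled
        # domain-based parallel replication (and so vouches for their
        # independence).
        return "domain" in modes
    # Same domain: parallel only if they group-committed on the master.
    return "groupcommit" in modes and a["commit_group"] == b["commit_group"]
```

Here a user with several domains that must not be replicated out of order
would simply leave "domain" out of the set and keep "groupcommit".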

Regards,
Sergei

