maria-developers team mailing list archive

Re: More suggestions for changing option names for optimistic parallel replication

 

On Mon, Dec 8, 2014 at 6:45 AM, Kristian Nielsen
<knielsen@xxxxxxxxxxxxxxx> wrote:
>>> --slave-parallel-domains=on|off     (default on)
>>>
>>>     "This replaces the "domain" option of old --slave-parallel-mode. When
>>>     enabled, parallel replication will apply in parallel transactions whose
>>>     GTID has different domain ids (GTID mode only).
>>
>> I don't understand what would be the meaning of combining this flag
>> with --slave-parallel-mode. Does it mean that when this flag is on
>> transactions from different domains are executed on "all_transactions"
>> level of parallelism no matter what value --slave-parallel-mode has?
>> What will happen if this flag is off but
>> --slave-parallel-mode=all_transactions?
>
> These apply on two different levels. With --slave-parallel-domains=on, each
> replication domain is replicated as completely independent streams, similar to
> different multi-source replication slaves. The position in each stream is
> tracked with one GTID each in gtid_slave_pos, and one stream can be
> arbitrarily ahead of another.

This is not entirely true, right? Let's say the master binlog has
transactions T1.1, T1.2, T1.3, T1.4, T2.1, T1.5, T2.2 (where T1.* have
domain_id = 1 and T2.* have domain_id = 2) and the slave has 3 parallel
threads. Then, as I understand it, the threads will be assigned to
execute T1.1, T1.2 and T1.3. T2.1 won't be scheduled to execute until
these 3 transactions (or at least 2 of them, T1.1 and T1.2) have been
committed. So streams from different domains are not completely
independent, right?
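
The example above can be sketched as a toy model in Python. It is only
an illustration under stated assumptions, not the server's scheduler:
it assumes a single dispatcher that hands out transactions strictly in
binlog order to a fixed pool of workers, and `releases_needed` is a
hypothetical helper name.

```python
def releases_needed(binlog, threads):
    """Map each transaction to the number of earlier transactions that
    must release a worker before it can be dispatched.

    Assumption (not taken from the server source): one dispatcher hands
    out transactions strictly in binlog order to a fixed worker pool.
    """
    return {txn: max(0, i - threads + 1) for i, txn in enumerate(binlog)}


# Binlog order from the example: T1.* are domain 1, T2.* are domain 2.
binlog = ["T1.1", "T1.2", "T1.3", "T1.4", "T2.1", "T1.5", "T2.2"]
waits = releases_needed(binlog, threads=3)
for txn in binlog:
    print(txn, waits[txn])
```

In this model T2.1 needs two workers to be released first. If workers
are released at commit and domain 1 commits in order, those two are
T1.1 and T1.2, which is the "at least 2 of them" case described above.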

> The --slave-parallel-mode applies within each stream. Within one stream,
> commits are strictly ordered, and --slave-parallel-mode specifies how much
> parallelism is attempted.
>
> The --slave-parallel-mode can be set to any value and the server is
> responsible to ensure that replication works correctly. In contrast, using
> --slave-parallel-domains, it is the user's/DBA's responsibility to ensure that
> replication domains are set up correctly so that no conflict can occur between
> them.
>
>> I feel like you are up to something here, but implementing it using
>> this flag is not quite right.
>
> Can you elaborate? --slave-parallel-domains controls whether we have one
> stream or many. --slave-parallel-mode controls what happens inside each
> stream. Any suggestion how to clarify?

As I pointed out above, the streams from multiple domains are completely
independent only when they come from multiple masters. When they
come from a single master they are not completely independent, and that
creates confusion (at least for me) about how these options work
together in that case.

I guess the big question I want to ask is: why would someone want to use
multiple domains together with slave-parallel-domains = off? If it's a
kind of kill switch to turn off the multi-domain feature completely if it
causes trouble for some reason, then I don't think it is baked deep
enough to actually work like that. But I don't understand what else
it could be used for.

>> Hm... The fact that a transaction did a lock wait on master doesn't
>> mean that the conflicting transaction was committed on master, or that
>> both of these transactions were committed close enough to even make it
>> possible to be executed in parallel on slaves, right? Are you sure
>> that this flag will be useful?
>
> Right, and no, I'm not sure. Testing will be needed to have a better idea.
>
> If two short transactions T1 and T2 conflict on a row, T2 is quite likely to
> commit just after T1, and thus likely to conflict on the slave. So there is
> some rationale behind this.

Right. For normal slaves T2 should be committed quickly after T1; for
slaves catching up from far behind, T2 should be committed in close
proximity to T1 (the distance should be less than slave-parallel-threads).
Both seem to be too narrow a use case to make it worth adding a flag
that can significantly hurt the majority of other use cases. I think
this feature will be useful only if the master somehow leaves
information about which transaction T2 was in conflict with, and then
the slave makes sure that T2 is not started until T1 has finished.
Though this already sounds over-complicated.

>>>     @@SESSION.replicate_expect_conflicts=0|1   (default 0)
>
>> I think this variable will be completely useless and is not worth
>> implementing. How will a user understand that the transaction he is
>> about to execute is likely to conflict with other transactions
>> committed at about the same time? I think it will be completely
>> impossible to make that judgement, and at the same time it will put
>> too much influence over the slave's behavior into users' hands. Am I
>> missing something? What kind of scenario are you envisioning this
>> variable to be used in?
>
> My main worry with optimistic parallel replication is that too many conflicts
> and retries on the slave will outweigh the performance gained from
> parallelism. If this does not happen, I feel it will be awesome. So I was very
> focused on what to do if we _do_ get a lot of conflicts. So I wanted to give
> advanced users the possibility to work around hotspot rows basically, if
> necessary. Like a single row that is updated very frequently.
>
> I did not think that this was allowing users much impact on the slave's
> behaviour. This option is only a heuristic; it controls how aggressively the
> slave will try to parallelise, but it cannot affect correctness. And the user
> already has a lot of ways to affect parallelism in optimistic parallel
> replication.
>
> For example, imagine lots of transactions like this executed serially on the
> master:
>
>   UPDATE t1 SET a=a+1 WHERE id=0;
>   UPDATE t1 SET a=a+1 WHERE id=0;
>   UPDATE t1 SET a=a+1 WHERE id=0;
>   ...
>
> All of these would conflict on a slave. It seems likely to cause O(N**2)
> transaction retries on the slave for --slave-parallel-mode=all_transactions
> --slave-parallel-threads=N.
>
> So the idea was that the user can already cause trouble for parallelism on the
> slave; @@replicate_expect_conflicts is intended for the power user to be able
> to hint the slave at how to get less trouble.
>
> But I'm open to change it, if you think it's important. Your perspective is
> rather different from my usual point of view, which is useful input.
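
One way to arrive at a quadratic retry count for the hot-row example
quoted above is the toy model below. It is an illustration, not
measured server behaviour: it assumes that N workers each start one of
the conflicting UPDATEs at the same time, that only the earliest can
commit, and that every other in-flight transaction is rolled back and
retried once for each commit ahead of it. `retries_per_batch` is a
hypothetical helper name.

```python
def retries_per_batch(n):
    """Count retries for n mutually conflicting transactions started
    together, under the assumption that each commit forces every
    transaction still in flight to roll back and retry once."""
    retries = 0
    pending = n
    while pending > 0:
        pending -= 1        # the earliest pending transaction commits
        retries += pending  # all remaining ones roll back and retry
    return retries


for n in (1, 2, 4, 8):
    print(n, retries_per_batch(n))
```

Under these assumptions the count is N*(N-1)/2 per batch of N, which
grows as O(N**2). A slave that makes a retried transaction wait for its
predecessor to commit before running again would see fewer retries.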

I understand everything that you say, but I think the difference
between our views is that you consider DBAs and database users to be
mostly the same people, or two small groups sitting in the same room
and easily communicating with each other. For me that's not true. For
me DBAs are a distinct group of people who can sit in a different
city from the users, and who may not be able to communicate with users
at all because there are hundreds of them and it's not clear to whom
some particular actions belong.
So when you say "the user already has a lot of ways to affect parallel
replication" it translates to me as "there are certain workloads where
parallel replication will be slower than sequential". Yes, I agree
with that. If I meet such a workload I will have to turn off
parallel replication, or I (with your help) will have to find some
generic improvement to make parallel replication work better with such
a workload too. And I want to underline that: the improvement should be
_generic_, it should work for all users and shouldn't involve any
changes on the users' side.
When you try to give users a variable that gives them control over the
treatment of hot rows, for me it means you create a tool that may be
misused by some users because they read something on the internet,
misunderstood it, love doing stupid things, or whatever else. And so at
some point we may wonder why the parallel replication that we worked
so hard to set up doesn't actually work, and then find that it's just
because of users' misbehavior. Besides, when such hot rows are found in
production it may not be that easy to modify users' code to set
this variable, because there may be many different user
groups involved, and adding the variable in just half of them won't
work...

So overall I don't think this variable will be useful for large installations.


Thank you,
Pavel

