maria-developers team mailing list archive

Thread
Date

Re: Review of MDEV-4506, parallel replication, part 1

To: monty@xxxxxxxxxxxx
From: Kristian Nielsen <knielsen@xxxxxxxxxxxxxxx>
Date: Tue, 17 Sep 2013 11:02:09 +0200
Cc: maria-developers@xxxxxxxxxxxxxxxxxxx
In-reply-to: <21042.47030.664904.739419@narttu.askmonty.org> (Michael Widenius's message of "Fri, 13 Sep 2013 09:59:02 +0300")
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/23.4 (gnu/linux)

Michael Widenius <monty@xxxxxxxxxxxx> writes:

> +      if (list->wakeup_subsequent_commits_running)
> +      {
> +        mysql_mutex_lock(&list->LOCK_wait_commit);
> +        list->wakeup_subsequent_commits_running= false;
> +        mysql_mutex_unlock(&list->LOCK_wait_commit);
>
> Why do we need the mutex above?

I checked this again.

We _do_ need a full memory barrier here (memory barrier is implied in taking a
mutex). Otherwise the compiler or CPU could re-order the setting of the
wakeup_subsequent_commits_running flag with the reads and writes done in the
list manipulations. This could cause the two threads to corrupt the list as
they both manipulate it at the same time.

But after carefully checking, it seems to me a barrier would be enough, we do
not need to lock and unlock the mutex.

Unfortunately, we do not have any good mechanisms for doing memory barriers in
the MariaDB source currently, AFAIK. I will put a comment here that the
lock/unlock is only needed for the memory barrier, and could be changed
later. Then we can re-visit it when the more critical things have been fixed
(and if we get memory barrier primitives in the source tree). What do you
think of this suggestion?

[I was wondering why I did it like this in the first place.

Due to chaining between the waiter's mutex and the waitees mutex, the extra
lock/unlock does prevent that wakeup_subsequent_commits2() can change the flag
in the middle of unregister_wait_for_prior_commit2() doing its
operation. Maybe that is why I did it this way. But I could not actually think
of any problems from such a change in the middle.

Or maybe I did it this way to emphasise that the code conceptually is locking
both the waitee and the waiter - but to prevent deadlocks, it temporarily
releases the waitee lock in the middle. So instead of

   LOCK a; LOCK b; UNLOCK b; UNLOCK a;

we inject a temporary unlock of a:

   LOCK a; [UNLOCK a]; LOCK b; UNLOCK b; [LOCK a]; UNLOCK a;

But it does not really make things any clearer.]

> The only place where we test this variable, as far as I can find, is in
> wait_for_commit::register_wait_for_prior_commit()

The critical place is in wait_for_commit::unregister_wait_for_prior_commit2()

> This thread doesn't know if the other thread is just before
> mysql_mutex_lock(&waitee->LOCK_wait_commit), inside the lock or after
> the lock.  So when we reset the variable should not matter.

What we need to ensure is that

 - All the reads of the list in thread 1 see only writes from other threads
   that happened _before_ the variable was reset.

 - All changes to the list in thread 2 happen only _after_ the variable was
   reset in thread 1.

So a memory barrier is needed, but as you say, lock/unlock should not be
needed.

> But somehow this sounds strange. What happens if the above variable is
> reset just before mysql_mutex_lock(&waitee->LOCK_wait_commit) in
> wait_for_commit::register_wait_for_prior_commit() ?
> Isn't the waitee added to the list when it shouldn't?

That should be ok. The caller ensures that wakeup_subsequent_commits2() will
be called _after_ the last call to register_wait_for_prior_commit() has
completed. This is done in this code in handle_rpl_parallel_thread():

        mysql_mutex_lock(&entry->LOCK_parallel_entry);
        if (entry->last_committed_sub_id < event_gtid_sub_id)
        {
          entry->last_committed_sub_id= event_gtid_sub_id;
          mysql_cond_broadcast(&entry->COND_parallel_entry);
        }
        mysql_mutex_unlock(&entry->LOCK_parallel_entry);
        rgi->commit_orderer.wakeup_subsequent_commits();

and

        mysql_mutex_lock(&entry->LOCK_parallel_entry);
        if (wait_for_sub_id > entry->last_committed_sub_id)
        {
          wait_for_commit *waitee=
            &rgi->wait_commit_group_info->commit_orderer;
          rgi->commit_orderer.register_wait_for_prior_commit(waitee);
        }
        mysql_mutex_unlock(&entry->LOCK_parallel_entry);

Thanks,

 - Kristian.

Follow ups

Re: Review of MDEV-4506, parallel replication, part 1
From: Michael Widenius, 2013-09-22

References

Review of MDEV-4506, parallel replication, part 1
From: Michael Widenius, 2013-09-13