
maria-developers team mailing list archive

Re: Memory barrier problem in InnoDB/XtraDB mutex implementation causing hangs


Sergey Vojtovich <svoj@xxxxxxxxxxx> writes:

> Never seen this before. InnoDB of 5.5 and InnoDB/XtraDB of 10.0 both have:
> void
> mutex_set_waiters(
> /*==============*/
>         ib_mutex_t*     mutex,  /*!< in: mutex */
>         ulint           n)      /*!< in: value to set */
> {
>         volatile ulint* ptr;            /* declared volatile to ensure that
>                                         the value is stored to memory */
>         ut_ad(mutex);
>         ptr = &(mutex->waiters);
>         os_wmb;
>         *ptr = n;               /* Here we assume that the write of a single
>                                 word in memory is atomic */
> }
> But I confirm that I see this in XtraDB of 5.5.

Interesting. Clearly I got confused between 5.5/10.0 and the XtraDB/InnoDB
distinction. There might be more to learn here.

In fact, I was looking mostly at the assembler output, to be sure I correctly
understood all the #ifdef magic and the actual instructions generated, and
only referring back to the source code for the description.

Looks like storage/innobase has a __sync_lock_test_and_set() (a full barrier
on x86) in mutex_test_and_set() instead. But making sure that all the
different combinations are correct looks like it could be a lot of fun...

> Agree. What we basically do is:
> lock_word= 0; // atomic store with release barrier
> isync;
> if (waiters) // non-atomic load

But that's wrong, isn't it? Doesn't the isync need to go _after_ the
conditional branch?

isync discards any speculatively executed instructions that follow it. So if
you load a value and make a conditional branch that depends on the loaded
value, no later instruction can complete until the load has completed, because
the load must complete before the branch can be resolved. This effectively
gives LoadLoad and LoadStore ordering.

But isync _before_ a conditional branch - I am not sure that does anything.
But I guess this was your point, as well ;-)

> Here we need to ensure that store completes before load. But:
> - there is no isync on x86
> - isync doesn't guarantee StoreLoad order (there are many articles about this)


> So full memory barrier seem to be the only option.


>> 2. In a user hang I investigated in detail, a thread was waiting on
>> log_sys->mutex, but no other thread was holding that mutex (consistent with
>> lost wakeup). The server error monitor thread was itself stuck on a mutex,
>> which would prevent sync_arr_wake_threads_if_sema_free() from hiding the
>> problem.
> Strange... Monty should have fixed this. Error monitor thread should call
> log_get_lsn_nowait(), which basically does trylock. Do you happen to have call
> trace?

This investigation was some time ago (maybe 1-2 months). It seems likely that
this was a version from before Monty's log_get_lsn_nowait() fix (e.g. 10.0.14
has it, but 10.0.13 does not).

> Do you have a link to articles that define full memory barrier for mutex

Hm, I don't really have a reference to point to. I believe the POSIX threads
specification says that a number of operations (including mutex lock/unlock)
ensure full memory ordering across them.

> lock/unlock? Probably that was quite outdated writing. See e.g.:

Outdated, perhaps, but acquire/release is AFAIK a relatively new concept;
mutexes, and the InnoDB source code, predate it. I'm not sure that C++ coming
up with some strange new semantics for their language has much bearing on
legacy code.
But I agree it would be nice to have some references about "old-style" mutexes
implying full memory barrier, so that it's not just me...

> http://en.cppreference.com/w/cpp/atomic/memory_order
> ...Atomic load with memory_order_acquire or stronger is an acquire operation.
> The lock() operation on a Mutex is also an acquire operation...
> ...Atomic store with memory_order_release or stronger is a release operation.
> The unlock() operation on a Mutex is also a release operation...

Interesting... so C++ defines a "Mutex" with different semantics from what is
usually understood by e.g. pthread_mutex...

> Full memory barriers are way too heavy even for mutexes. All we need to to is
> to ensure that:
> - all loads following lock are performed after lock (acquire)
> - all stores preceding unlock are completed before unlock (release)

Are you sure?

Note that if this is true, then it means that there is _no_ way to get a
StoreLoad barrier using only normal mutex operations. That seems odd...

I know I have seen, and written, code that depends on lock/unlock being full
barrier. How can we be sure that such code doesn't also exist in InnoDB?

Though I agree that a full barrier is a lot more expensive than just LoadLoad
or StoreStore, and best avoided if possible (I even blogged about that not so
long ago).

 - Kristian.
