← Back to team overview

maria-developers team mailing list archive

Re: [Commits] Rev 4346: MDEV-7026 - Occasional hang during startup on Power8 in lp:maria/5.5

 

Hi Kristian,

thanks for your feedback! Since we're short on time and you have an idea of a
proper fix, what if we do it the other way around: you'll implement a fix and
I'll review it from PPC64 perspective?

I'm sorry but I won't be able to come up with a reasonable patch in that short
time frame.

Thanks,
Sergey

On Tue, Nov 18, 2014 at 04:59:55PM +0100, Kristian Nielsen wrote:
> Sergey Vojtovich <svoj@xxxxxxxxxxx> writes:
> 
> > revno: 4346
> > revision-id: svoj@xxxxxxxxxxx-20141118080742-sdf93kgsgxulenvb
> > parent: knielsen@xxxxxxxxxxxxxxx-20141112101013-t63ayylsk08ncgs3
> > committer: Sergey Vojtovich <svoj@xxxxxxxxxxx>
> > branch nick: 5.5
> > timestamp: Tue 2014-11-18 12:07:42 +0400
> > message:
> >   MDEV-7026 - Occasional hang during startup on Power8
> 
> Right, so you requested second review ASAP, I'm trying to understand what's
> going on here.
> 
> If my understanding is correct, this patch goes on top of current 5.5.40, not
> on top of the patch I reviewed previously.
> 
> So there are two cases, depending on whether INNODB_RW_LOCKS_USE_ATOMICS is
> set or not. It is my understanding that this is "usually" set, eg. on our
> common x86 and powerpc linux binaries.
> 
> The critical thing for the patch at hand is the memory barrier between
> mutex_set_waiters() and mutex_test_and_set() in the waiter, and
> mutex_reset_lock_word() and mutex_get_waiters() in the thread releasing the
> lock.
> 
> A full memory barrier is needed in the waiter, to be sure that
> mutex_set_waiters() becomes visible to the releasing thread before
> mutex_test_and_set() runs. And a full barrier is needed in the releasing
> thread, to be sure that mutex_reset_lock_word() becomes visible before
> mutex_get_waiters() runs.
> 
> So let's see how this is ensured. First the case of x86 with
> INNODB_RW_LOCKS_USE_ATOMICS set, HAVE_ATOMIC_BUILTINS, pre-5.5.40:
> 
>  - mutex_set_waiters() uses os_compare_and_swap_ulint() which is
>    __sync_bool_compare_and_swap() which is full barrier.
> 
>  - mutex_reset_lock_word() uses os_atomic_test_and_set_byte() which is
>    __sync_lock_test_and_set(). This is documented only as a release barrier,
>    but is in fact a full barrier on x86 with current GCC.
> 
> So in this case full barrier is there as needed.
> 
> If HAVE_ATOMIC_BUILTINS is not set, then mutex_reset_lock_word() uses
> os_fast_mutex_unlock() which is pthread_mutex_unlock(), which is also full
> barrier on x86.
> 
> If INNODB_RW_LOCKS_USE_ATOMICS is not set, then mutex_set_waiters() has no
> barriers. But then mutex_test_and_set() instead has a full barrier on
> x86. Either in os_atomic_test_and_set_byte() which is
> __sync_lock_test_and_set(), if HAVE_ATOMIC_BUILTINS is set, or in
> os_fast_mutex_trylock() otherwise. Both of these are full barriers on x86.
> 
> So this is why things work ok on x86, apparently. Very bad code that seriously
> needs cleanup and proper documentation of memory barrier semantics. But it is
> what we have to work with in 5.5.
> 
> So to make things work on PowerPC without risk of breaking x86, we could do
> this:
> 
>  - Make sure os_atomic_test_and_set_byte() is full barrier on PowerPC:
> 
> #ifdef __powerpc__
> /*
>   os_atomic_test_and_set_byte() should imply a full barrier. But
>   __sync_lock_test_and_set() is only documented as an aquire barrier. So on
>   PowerPC we need to add the full barrier explicitly.
> */
> # define os_atomic_test_and_set_byte(ptr, new_val) \
>         do { __sync_lock_test_and_set(ptr, (byte) new_val); \
>              __atomic_thread_fence(__ATOMIC_SEQ_CST); } while(0)
> #else
> /*
>   On x86, __sync_lock_test_and_set() happens to be full barrier, due to
>   LOCK prefix.
> */
> # define os_atomic_test_and_set_byte(ptr, new_val) \
>         __sync_lock_test_and_set(ptr, (byte) new_val)
> #endif
> 
> 
>  - Make sure that mutex_reset_lock_word() is a full barrier when
>    HAVE_ATOMIC_BUILTINS is not set, by adding a memory barrier:
> 
>         os_fast_mutex_unlock(&(mutex->os_fast_mutex));
> #ifdef __powerpc__
>         /*
>           On PowerPC, pthread_mutex does not imply full barrier, so we need to
>           add it explicitly.
>         */
>         __atomic_thread_fence(__ATOMIC_SEQ_CST);
> #endif
> 
> 
>  - For the case of neither HAVE_ATOMIC_BUILTINS nor
>    INNODB_RW_LOCKS_USE_ATOMICS set, similarly add a full barrier in
>    mutex_test_and_set() for PowerPC:
> 
> #ifdef __powerpc__
>         /*
>           On PowerPC, pthread_mutex does not imply full barrier, so we need to
>           add it explicitly.
>         */
>         __atomic_thread_fence(__ATOMIC_SEQ_CST);
> #endif
> 	ret = os_fast_mutex_trylock(&(mutex->os_fast_mutex));
> 
> 
> Then in 10.1. if someone is brave enough, we can clean this up properly.
> Clearly define what the memory barrier semantics is, and implement it using
> the correct primitives that work as needed on all platforms.
> 
> So now that I understand (hopefully...) how the code used to work, and how it
> can be fixed for PowerPC without changing existing code, let me look at the
> new patch ...
> 
> 
> > === modified file 'storage/innobase/include/os0sync.h'
> > --- a/storage/innobase/include/os0sync.h      2014-09-08 15:10:48 +0000
> > +++ b/storage/innobase/include/os0sync.h      2014-11-18 08:07:42 +0000
> > @@ -310,8 +310,16 @@ Returns the old value of *ptr, atomicall
> >  # define os_atomic_test_and_set_byte(ptr, new_val) \
> >       __sync_lock_test_and_set(ptr, (byte) new_val)
> >  
> > +#ifdef __powerpc__
> >  # define os_atomic_lock_release_byte(ptr) \
> >       __sync_lock_release(ptr)
> > +#else
> > +/* In theory __sync_lock_release should be used to release the lock.
> > +Unfortunately, it does not work properly alone. The workaround is
> > +that more conservative __sync_lock_test_and_set is used instead. */
> > +# define os_atomic_lock_release_byte(ptr) \
> > +     __sync_lock_test_and_set(ptr, 0)
> > +#endif
> 
> This is very confusing. You define a method with full barrier semantics, but
> call it something suggesting release semantics.
> 
> Why not revert the original change so that mutex_reset_lock_word() uses
> os_atomic_test_and_set_byte()? And then fix the implementation of
> os_atomic_test_and_set_byte() to be full barrier on PowerPC, as I suggested
> above? The term "atomic" generally implies full barrier in the MySQL source
> code...
> 
> >  #elif defined(HAVE_IB_SOLARIS_ATOMICS)
> >  
> > @@ -428,7 +436,7 @@ clobbered */
> >  # define os_rmb      __atomic_thread_fence(__ATOMIC_ACQUIRE)
> >  # define os_wmb      __atomic_thread_fence(__ATOMIC_RELEASE)
> >  #ifdef __powerpc__
> > -# define os_isync  __asm __volatile ("isync":::"memory")
> > +# define os_isync __atomic_thread_fence(__ATOMIC_SEQ_CST)
> >  #else
> >  #define os_isync do { } while(0)
> >  #endif
> 
> As Jonas suggested, don't call a full memory barrier "isync". And do not
> introduce something that is a barrier on one platform but a no-operation on
> another platform. But ...
> 
> > === modified file 'storage/innobase/sync/sync0sync.c'
> > --- a/storage/innobase/sync/sync0sync.c       2014-11-03 13:43:44 +0000
> > +++ b/storage/innobase/sync/sync0sync.c       2014-11-18 08:07:42 +0000
> > @@ -474,10 +474,9 @@ mutex_set_waiters(
> >  
> >       ptr = &(mutex->waiters);
> >  
> > -        os_wmb;
> > -
> >       *ptr = n;               /* Here we assume that the write of a single
> >                               word in memory is atomic */
> > +        os_isync;
> >  }
> 
> With this, the semantics of how things work becomes even more complicated. Now
> on PowerPC the barrier is in one function, but on x86 the barrier is in
> another function.
> 
> Why not instead do as I suggested above? Put in full barriers for PowerPC in
> the same places that they are in x86?
> 
> ----
> 
> Maybe I should try to say if I think the patch fixes the actual bug, in case
> you don't want to go with my suggestions and instead keep your approach? Since
> it goes back to using __sync_lock_test_and_set() on x86, and since it has a
> full barrier in mutex_exit_func() on PowerPC, I think perhaps it does. But it
> is really hard to understand exactly what is going on, as you can see above I
> had to do a lot of reasoning to understand even why the old code worked...
> 
> My advice is still to revert the original incorrect PowerPC patch and do
> something that is at least consistent with the existing code...
> 
> (And then feel free to do a proper cleanup in 10.1).
> 
> Hope this helps,
> 
>  - Kristian.


Follow ups

References