← Back to team overview

maria-developers team mailing list archive

Re: [Commits] Rev 4367: MDEV-7026: Race in InnoDB/XtraDB mutex implementation can stall or hang the server. in http://bazaar.launchpad.net/~maria-captains/maria/5.5


Hi Kristian,

On Thu, Nov 20, 2014 at 02:28:54PM +0100, Kristian Nielsen wrote:
> Sergey Vojtovich <svoj@xxxxxxxxxxx> writes:
> >> +/*
> >> +  os_atomic_test_and_set_byte_acquire() is a full memory barrier on x86. But
> >> +  in general, just an aquire barrier should be sufficient. */
> >> +# define os_atomic_test_and_set_byte_acquire(ptr, new_val) \
> >>          __sync_lock_test_and_set(ptr, (byte) new_val)
> > !!! The problem is here!!! Strictly speaking not here, but in mutex_spin_wait():
> > ...
> >         mutex_set_waiters(mutex, 1);
> >
> >         /* Try to reserve still a few times */
> >         for (i = 0; i < 4; i++) {
> >                 if (mutex_test_and_set(mutex) == 0) {
> > ...
> > We'll get stuck if "waiters= 1" will become visible after
> > "if (mutex_test_and_set(mutex))". I don't see anything preventing this reorder.
> 1. If INNODB_RW_LOCKS_USE_ATOMICS is set, then mytex_set_waiters() is
> os_compare_and_swap_ulint() which is __sync_bool_compare_and_swap(). This is
> documented as a full barrier. So it is not possible for "waiters=1" to become
> visible after "if (mutex_test_and_set(mutex))".
> set, then os_atomic_test_and_set_byte_acquire() is used in
> mutex_test_and_set(). This is a full barrier on x86 (despite using the wrong
> GCC primitive), so should be ok there. On PowerPC it is probably only an
> acquire barrier, so you are probably right that the "waiters=1" may only
> become visible after "if (mutex_test_and_set(mutex))". If I understand
> correctly, this was the same before my patch also.
> then mutex_test_and_set() uses os_fast_mutex_trylock_full_barrier(), which
> again prevents "waiter=1" to be come visible too late, at least on x86. You
> may be right that the os_mb() in os_fast_mutex_trylock_full_barrier() is no-op
> in this case, so that PowerPC has only acquire barrier in trylock(), which is
> insufficent. Again, if I understand correctly, this is unchanged by my patch.
> So as I understand it, case (1) is correct for both x86 and PowerPC. Cases (2)
> and (3) should be correct for x86 also. (2) and (3) may be incorrect for
> PowerPC, but then they would have been incorrect before also, my patch does
> not change it.
> As you mentioned, it seems likely that problematic cases (2) and (3) do not
> occur at all in our builds, only working case (1) is used. If so, it seems we
> do not actually have the problem in current binaries. Maybe we could put in
>   #ifdef __powerpc__
>   #error not supported compiler
>   #endif
> for cases (2) and (3) to ensure this.
> Just to make it clear, I find it absolutely horrible that the InnoDB mutex
> code requires completely different reasoning to explain its correctness,
> depending on #ifdefs. It is almost impossible to be sure that one really
> understands how (and if) it works. That is why I find it so scary to change it
> in a stable release, and suggested something that does not change the
> existing, working x86 implementation.
Ok so you found extra inconsistencies in this code. Sorry that I wasn't clear
enough, but what I meant was...

mutex_set_waiters() calls __sync_bool_compare_and_swap() only in 5.5/XtraDB.
In this case it is indeed Ok. But 5.5/InnoDB, 10.0/InnoDB, 10.0/XtraDB just
do waiters= 1. I don't see this fixed.

Regarding INNODB_RW_LOCKS_USE_ATOMICS - it is abused in mutex code: as it says
it is supposed to control rwlocks only.