← Back to team overview

maria-developers team mailing list archive

Re: deadlock, rotate and prepared xids

 

Thanks for the analysis. Filed http://bugs.mysql.com/bug.php?id=68575 for
the frequent rotate problem.


On Tue, Mar 5, 2013 at 2:02 AM, Kristian Nielsen
<knielsen@xxxxxxxxxxxxxxx>wrote:

> MARK CALLAGHAN <mdcallag@xxxxxxxxx> writes:
>
> > http://bugs.mysql.com/bug.php?id=68569
>
> This is the classic semi-sync deadlock that you explained so many times
> already.
>
> The 5.6 group commit implements a similar technique to MariaDB, where a
> single
> thread does the fast part of commit for all transactions in the group. But
> for
> some reason, they put the semi-sync wait inside the single-threaded loop as
> well!?! So they get the classic deadlock: A waits for semi-sync
> notification
> before dropping prepared_xid counter. B waits for A to drop prepared_xid
> counter to zero before rotating binlog and releasing LOCK_log. C waits for
> B
> to release LOCK_log before it can send events to slave and get semisync
> notification for A ...
>
> > http://bugs.mysql.com/bug.php?id=68251
>
> This second one is different from the first one. One thread is trying to
> rotate the binlog, but waiting for prepared_xids count to drop to zero.
> There
> does not appear though to be any threads having prepared XIDs pending.
>
> But look at the code in binlog.cc and binlog.h that handles the
> prepared_xids
> counter:
>
>   while (get_prep_xids() > 0)
>     mysql_cond_wait(&m_prep_xids_cond, &LOCK_commit);
>
>   void dec_prep_xids() {
>     my_atomic_rwlock_wrlock(&m_prep_xids_lock);
>     int32 result= my_atomic_add32(&m_prep_xids, -1);
>     my_atomic_rwlock_wrunlock(&m_prep_xids_lock);
>     if (result == 1)
>       mysql_cond_signal(&m_prep_xids_cond);
>   }
>
> This code is trivially racy. It can easily happen that A reads the counter
> as
> 1, then B decrements the counter to zero and signals the condition (which
> does
> nothing as there is no waiter), then A goes to sleep on the condition,
> never
> to wake up.
>
> BTW, it looks like there is another bug in there. You have a stack trace
> with
> two threads, these two threads are waiting for LOCK_log to also rotate the
> binlog. So it looks like the 5.6 code is missing protection from multiple
> threads trying to rotate the binlog at the same time, which could lead to
> dummy rotations leaving empty binlog files.
>
> This makes me so sad. The code for 5.6 group commit is absolutely
> disgusting,
> and clearly written by people with absolutely no clue about what they are
> doing. The pthread_cond_signalling bug is even spelled out in the man page
> for
> pthread_cond_wait()/_signal(), they clearly did not even bother reading it!
> And this after years of you/Facebook, I, and Perconal all spending lots of
> effort solving and carefully documenting how to do this properly :-(
>
> > Did you do anything special to avoid deadlock with ::new_file_impl and
> > prepared XIDs? My memory is vague but I thought you handled it. I added
>
> With respect to the first one, I split the commit processing into two, and
> only the first part runs serialised. And the semisync wait runs during the
> second part, after the prepared_xids counter has been decremented. So
> there is
> no potential for the same deadlock.
>
> Later I changed binlog recovery to completely remove the prepared_xids
> counter
> and the associated wait/stall. That should make it safe to do the semisync
> wait early, since a pending commit can no longer block binlog rotation.
>
> With respect to the second one, having a clue about synchronisation
> primitives
> and a proper review process helps, what else can I say?
>
>  - Kristian.
>



-- 
Mark Callaghan
mdcallag@xxxxxxxxx

References