maria-developers team mailing list archive
Mailing list archive
Re: Problem with debug_sync, THD::enter_cond(), and kill ...
On Wed, Feb 26, 2014 at 11:51 AM, Kristian Nielsen
>> Looking at the code it looks like most places (including THD::awake)
>> assume that thd->mysys_var->current_mutex and
>> thd->mysys_var->current_cond can be changed only when
>> thd->mysys_var->mutex is locked. debug_sync.cc is breaking this
>> assumption. I've found that Item_func_sleep::val_int(),
>> THD::enter_cond() and THD::exit_cond() break this assumption too. I
>> believe all of those places should be fixed to lock
>> thd->mysys_var->mutex before any attempts to read or write to
>> thd->mysys_var->current_mutex and thd->mysys_var->current_cond.
> Hm... well, as I understand the code in THD::awake(), the idea is that it is
> ok for enter_cond() to _set_ the new values without locks. THD::awake() will
> use them if they are set, and ignore them if not. However, to _clear_ them,
> the mysys_var->mutex needs to be held - and awake() holds this mutex while
> using the values.
> It would seem a bit sad to introduce an extra mutex for every call to
> enter_cond(), only to satisfy some debug code which is never used in
> production. But maybe it is not so important, since that is only done when the
> thread is going to sleep on a condition anyway?
> I am also not sure if it could introduce risk of deadlocks or other issues if
> taking the mysys_var->mutex while also holding the
> mysys_var->current_mutex. There is a comment in exit_cond() that says that
> current_mutex must be unlocked before mysys_var->mutex is locked. But I do not
> know enough of this code to be sure if that is a problem also in enter_cond().
I didn't look at it this way (that current_cond and current_mutex can
go NULL -> non-NULL without mutex and then changed under mutex).
Probably that could usually work. And that would mean that only
debug_sync.cc must be changed to lock thd->mysys_var->mutex before
changing current_cond and current_mutex.
But I'd say trying to build system like that is very fragile and
unreliable. You can't trust the fact that someone won't later look at
THD::enter_cond(), decide that current_cond and current_mutex can be
changed anywhere without mutex, and add some code doing that. And you
can't even rely on the fact that compiler won't optimize something and
won't move assignment of current_mutex and current_cond into some
inappropriate place or won't even write some temporary values in there
-- compiler is totally allowed to do that. Besides the code like this
makes it completely impossible to analyze MySQL/MariaDB execution
using dynamic tools like ThreadSanitizer
is very helpful in finding bugs.
So I'd suggest to add an extra mutex or come up with some other
synchronization solution that will guarantee cleanliness without any