maria-developers team mailing list archive

Thread
Date

Re: [Commits] Rev 3902: MDEV-7123: MariaDB 10.0.14 Galera node shutdown with signal 11 in lp:~maria-captains/maria/maria-10.0-galera

To: Nirbhay Choubey <nirbhay@xxxxxxxxxxx>, maria-developers@xxxxxxxxxxxxxxxxxxx
From: Kristian Nielsen <knielsen@xxxxxxxxxxxxxxx>
Date: Wed, 26 Nov 2014 12:09:40 +0100
In-reply-to: <54720e33.28568c0a.7854.0e8c@mx.google.com> (Nirbhay Choubey's message of "Sun, 23 Nov 2014 08:41:23 -0800 (PST)")
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/23.4 (gnu/linux)

Nirbhay Choubey <nirbhay@xxxxxxxxxxx> writes:

> revno: 3902
> revision-id: nirbhay@xxxxxxxxxxx-20141123163533-9ov1yirj26cur301
> parent: nirbhay@xxxxxxxxxxx-20141119183337-n5spskf54beqbhif
> committer: Nirbhay Choubey <nirbhay@xxxxxxxxxxx>
> branch nick: maria-10.0-galera
> timestamp: Sun 2014-11-23 11:35:33 -0500
> message:
>   MDEV-7123: MariaDB 10.0.14 Galera node shutdown with signal 11 
>   
>   wsrep-patch uses same connection name for constructing Master_info
>   objects. As a result all existing wsrep Master_info objects refer
>   to same rpl_filter object. This could lead to race when multiple
>   threads try to destruct/delete Master_info object, as they would all
>   try to delete the same relay_log object.
>   Fixed by adding a mutex to protect the critical section.

The patch is ok. But please consider the following points, and act on them or
not as you see fit:

1. You are introducing a mutex lock/unlock in a general data structure used in
a couple of places (looks like keycaches and rpl_filters?). Did you consider
the performance impact of this? When will the additional locking/unlocking
take place? If it only happens in less often used operations (like
re-configuring replication filters or key caches?), then extra locking is not
an issue. If it happens often, like once per query or something like that, it
would probably be best to find a solution with less performance impact. (mutex
lock and unlock are quite expensive even if not contended).

2. Consider using explicit mysql_mutex_lock() and mysql_mutex_unlock() calls
rather than introducing extra lock()/unlock() methods. The extra methods are
needless abtractions that make it harder to understand exactly what is
happening wrt. locking.

3. Did you consider if a test case could be made for the originally reported
bug? We used to have a rule that any bug fix must include a test case that
reproduces the bug fixed (ie. the test case must fail before the patch). Not a
hard rule (a few bugs can be extremely hard to write mtr testcases for), but a
good rule nevertheless. But maybe this rule has been dropped, now that the
focus is on quantity of Jira tickes closed rather than quality.

 - Kristian.