maria-developers team mailing list archive
-
maria-developers team
-
Mailing list archive
-
Message #09488
Re: Deadlock with STOP SLAVE and LOCK_active_mi
Hi, Kristian!
On Apr 15, Kristian Nielsen wrote:
> So while looking at MDEV-9573, I found this code in
> mysql_execute_command() for STOP SLAVE:
>
> mysql_mutex_lock(&LOCK_active_mi);
> if ((mi= (master_info_index->
> get_master_info(&lex_mi->connection_name,
> Sql_condition::WARN_LEVEL_ERROR))))
> if (!stop_slave(thd, mi, 1/* net report*/))
> my_ok(thd);
> mysql_mutex_unlock(&LOCK_active_mi);
>
> So basically, this code is holding LOCK_active_mi for the entire
> duration of the STOP SLAVE operation.
>
> This seems completely broken. Anything in a slave thread that tries to
> take LOCK_active_mi will then deadlock with the STOP SLAVE operation.
> It is simple enough to trigger, testcase (with sleep-injecting patch)
> at the end of the email. This uses INFORMATION_SCHEMA.SESSSION_STATUS;
> I could imagine accessing a number of system variables could trigger
> it as well.
>
> From a quick check, it looks like this has been like this forever, eg.
> 5.1 has similar code.
>
> Any suggestions for how this is supposed to work? Or is it just broken
> by design, but saved because normally slave threads do not need to
> access SHOW STATUS or system variables?
These locks were added in this commit:
commit f808030
Author: unknown <guilhem@xxxxxxxxx>
Date: Thu Mar 11 16:23:35 2004 +0100
Fix for BUG#2921 "Replication problem on mutex lock in mySQL-4.0.18":
re-using unused LOCK_active_mi to serialize all administrative
commands related to replication:
START SLAVE, STOP SLAVE, RESET SLAVE, CHANGE MASTER, init_slave()
(replication autostart at server startup), end_slave() (replication
autostop at server shutdown), LOAD DATA FROM MASTER.
This protects us against a handful of deadlocks (like BUG#2921
when two START SLAVE, but when two STOP SLAVE too).
Removing unused variables.
So, apparently LOCK_active_mi was unused at that time and Guilhem took
it to serialize replication administrative commands. Now this lock is
used. The fix, I suppose, should be to create a new lock and use it
here. Something like LOCK_serialize_replication_admin_commands.
Regards,
Sergei
Chief Architect MariaDB
and security@xxxxxxxxxxx
Follow ups
References