maria-developers team mailing list archive
-
maria-developers team
-
Mailing list archive
-
Message #07312
Possible race condition or NULL pointer triggered by OQGraph
Hi all
(Cross-posted to oqgraph-developers, maria-developers)
I am trying to track down a segfault apparently triggered by concurrent
execution of queries through OQGraph - for context, see:
https://mariadb.atlassian.net/browse/MDEV-6282
I am however a bit confused as to what is going on, because at least some of
the information leads me to suspect that there could be more going on here
than just a simple race condition.
--Immediate Cause--
On the surface, a segfault appears to be caused in a method in sql_class.h,
Statement::check_limit_rows_examined() dereferencing a NULL pointer, `lex`.
The method check_limit_rows_examined() was called from sql_class.h method
handler::increment_statistics() .
--Analysis--
A quick inspection reveals that a NULL value for `lex` is actually perfectly
valid some of the time, although I am nowhere near familiar enough with the
code to really understand when or why. Specifically, `lex` is a member of the
base class Statement, which is initialised to either point to THD::main_lex
via a constructor , or intentionally to NULL from construction of an object
Prepared_statement() (sql_prepare.cc)
To my mind, if there is going to be a possibility of a pointer being NULL,
then all uses of that pointer should be checking for that.
I am hoping that the bug reporter can prove or otherwise my hypothesis with
his particular scenario, although I am suspicious that there may be additional
issues to be dealt with; it is quite possible that lex being NULL is a red
herring, being caused by a race condition memory overrun elsewhere. However, I
think it is still a good idea to NULL check pointers that could be set to NULL
intentionally.
I can submit a pull request if the following patch is agreeable:
diff --git a/sql/sql_class.h b/sql/sql_class.h
index 5898d9e..957acf3 100644
--- a/sql/sql_class.h
+++ b/sql/sql_class.h
@@ -2475,8 +2475,10 @@ public:
*/
void check_limit_rows_examined()
{
- if (++accessed_rows_and_keys > lex->limit_rows_examined_cnt)
+ ++accessed_rows_and_keys;
+ if (lex && accessed_rows_and_keys > lex->limit_rows_examined_cnt) {
killed= ABORT_QUERY;
+ }
}
--Aside--
Immediately prior to the check_limit_rows_examined() call is a call to macro
status_var_increment(). Comments around the macro indicate it is often likely
to be called in a mutli-threaded context, but the data is such that it is
possible to disable protection with the only side effect being an occasionally
'inaccurate' result. Is it possible however, that check_limit_rows_examined()
is not thread-safe and should have some protection?
cheers,
Andrew
--A
Follow ups