maria-developers team mailing list archive

Thread
Date

MDEV-9712 Performance degradation of nested NULLIF

To: maria-developers <maria-developers@xxxxxxxxxxxxxxxxxxx>, Sergei Golubchik <serg@xxxxxxxxxxx>
From: Alexander Barkov <bar@xxxxxxxxxxx>
Date: Wed, 23 Mar 2016 10:39:23 +0400
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.6.0

  Hello Sergei,

Please have a look into a draft fix for MDEV-9712.

The problem happens because in 10.1 execution time for various
recursive stages (like walk, update_used_table and
propagate_equal_fields) in NULLIF is O(recursion_level^2),
because complexity is doubled on every recursion level
when we copy args[0] to args[2].


This fix simplifies some stages to make them faster.
It reduced execution time from 144 seconds to 12 seconds.

The idea is just not to iterate through args[2] when
it's just a copy of args[0] in these stages:

- Item_func_nullif::walk
- Item_func_nullif::update_used_tables

I'm not fully sure that skipping args[2] is always correct though.
For some transformers it may not be correct.


<offtopic>
Btw, walk() can be called from the partitioning code when arg_count is
still 2 rather than 3. Looks suspicious. Perhaps there is a hidden bug
behind that.
</offtopic>


The remaining heavy piece of the code is:


  Item* propagate_equal_fields(THD *thd, const Context &ctx, COND_EQUAL
*cond)
  {
    Context cmpctx(ANY_SUBST, cmp.compare_type(), cmp.compare_collation());
    args[0]->propagate_equal_fields_and_change_item_tree(thd, cmpctx,
                                                         cond, &args[0]);
    args[1]->propagate_equal_fields_and_change_item_tree(thd, cmpctx,
                                                         cond, &args[1]);
    args[2]->propagate_equal_fields_and_change_item_tree(thd,
                                                         Context_identity(),
                                                         cond, &args[2]);
    return this;
  }


We cannot use the same approach here, because args[0] and args[2]
are propagated with a different context (ANY_SUBST vs IDENTITY_SUBST).
Not sure what to do with this. Perhaps, data types could have an extra
attribute to tell that ANY_SUBST and IDENITTY_SUBST are equal.
So, for example, for non-ZEROFILL integers, ANY_SUBST and IDENTITY_SUBST
do the same thing and thus propagation could be done one time only.
But this would be a partial solution, for a limited number of data types.


Another option is to leave propagation as is.

Another option is do nothing at all.
If we rewrite the reported query as CASE, it will look heavy indeed.
So it's Ok to expect heavy execution complexity.


Thanks.

diff --git a/sql/item_cmpfunc.cc b/sql/item_cmpfunc.cc
index 1ce8bad..35014ed 100644
--- a/sql/item_cmpfunc.cc
+++ b/sql/item_cmpfunc.cc
@@ -2503,6 +2503,23 @@ void Item_func_nullif::split_sum_func(THD *thd, Item **ref_pointer_array,
 }
 
 
+bool Item_func_nullif::walk(Item_processor processor,
+                            bool walk_subquery, uchar *arg)
+{
+  /*
+    No needs to iterate through args[2] when it's just a copy of args[0].
+    See MDEV-9712 Performance degradation of nested NULLIF
+  */
+  uint tmp_count= arg_count == 2 || args[0] == args[2] ? 2 : 3;
+  for (uint i= 0; i < tmp_count; i++)
+  {
+    if (args[i]->walk(processor, walk_subquery, arg))
+      return true;
+  }
+  return (this->*processor)(arg);
+}
+
+
 void Item_func_nullif::update_used_tables()
 {
   if (m_cache)
@@ -2513,7 +2530,14 @@ void Item_func_nullif::update_used_tables()
   }
   else
   {
-    Item_func::update_used_tables();
+    /*
+      No needs to iterate through args[2] when it's just a copy of args[0].
+      See MDEV-9712 Performance degradation of nested NULLIF
+    */
+    DBUG_ASSERT(arg_count == 3);
+    used_tables_and_const_cache_init();
+    used_tables_and_const_cache_update_and_join(args[0] == args[2] ? 2 : 3,
+                                                args);
   }
 }
 
diff --git a/sql/item_cmpfunc.h b/sql/item_cmpfunc.h
index 39f2cf5..ca5fc67 100644
--- a/sql/item_cmpfunc.h
+++ b/sql/item_cmpfunc.h
@@ -1026,6 +1026,7 @@ class Item_func_nullif :public Item_func_hybrid_field_type
   String *str_op(String *str);
   my_decimal *decimal_op(my_decimal *);
   void fix_length_and_dec();
+  bool walk(Item_processor processor, bool walk_subquery, uchar *arg);
   uint decimal_precision() const { return args[2]->decimal_precision(); }
   const char *func_name() const { return "nullif"; }
   void print(String *str, enum_query_type query_type);

Follow ups

Re: MDEV-9712 Performance degradation of nested NULLIF
From: Sergei Golubchik, 2016-04-27