← Back to team overview

maria-developers team mailing list archive

Re: 2be9b69f4ff: MDEV-23001 Precreate static Item_bool() to simplify code

 

Hi!

<cut>

> > MDEV-23001 Precreate static Item_bool() to simplify code
> >
> > The following changes where done:
> > - Create global Item: Item_false and Item_true
> > - Replace all creation if 'FALSE' and 'TRUE' top level items used for
> >   WHERE/HAVING/ON clauses to use Item_false and Item_true.
> >
> > The benefit are:
> > - No test needed if we where able to create the new item.
> > - Fixed possible errors if 'new' would have failed for the Item_bool's
>
> agree
>
> > - Less and faster code
>
> I don't quite agree with that, it's not less code:

It is less code to add usage of a static bool item and it is faster to execute
it (no fix fields etc).

> >  7 files changed, 65 insertions(+), 43 deletions(-)
>
> and I doubt that speedup can be measured in any benchmarks whatsoever.

I agree. However this is a test for being able to create static items, that is
useful in any case.

> > diff --git a/sql/item.cc b/sql/item.cc
> > index 1a86b8b3114..0f160fa257b 100644
> > --- a/sql/item.cc
> > +++ b/sql/item.cc
> > @@ -55,7 +55,8 @@ const char *item_empty_name="";
> >  const char *item_used_name= "\0";
> >
> >  static int save_field_in_field(Field *, bool *, Field *, bool);
> > -
> > +const Item_bool_static Item_false("FALSE", 0);
> > +const Item_bool_static Item_true("TRUE", 1);
>
> The main question - we have tons of modifiable Item members. How do you
> know none of them will modified in your Item_bool_static objects?

Beause these are global const. We will get an assert if anyone tries.


> > @@ -421,6 +421,9 @@ Item::Item(THD *thd):
> >     /* Initially this item is not attached to any JOIN_TAB. */
> >    join_tab_idx= MAX_TABLES;
> >
> > +   /* thd is NULL in case of static items like Item_true */
> > +  if (!thd)
> > +    return;
>
> No, please, don't.
> thd can be NULL only twice, for statically initialized Item_true and
> Item_false. For that you removed a useful assert and added an if() that
> is completely unnecessary for billions of items created runtime.
>
> Better to have special contructor just for these two very special items.

You mean like Item()?
Good idea, will do that.

However, when trying that, it was not very easy because of the Item
hierachy that bar has created :(
Items are depending on a lot of other items. This is just the tip of
the iceberg:

diff --git a/sql/item.cc b/sql/item.cc
index 49e7b9a6266..fb63b4b7fb2 100644
--- a/sql/item.cc
+++ b/sql/item.cc
@@ -421,9 +421,6 @@ Item::Item(THD *thd):
    /* Initially this item is not attached to any JOIN_TAB. */
   join_tab_idx= MAX_TABLES;

-   /* thd is NULL in case of static items like Item_true */
-  if (!thd)
-    return;
   /* Put item in free list so that we can free all items at end */
   next= thd->free_list;
   thd->free_list= this;
@@ -441,6 +438,21 @@ Item::Item(THD *thd):
   }
 }

+/*
+  This is only used for static const items
+*/
+
+Item::Item():
+  is_expensive_cache(-1), rsize(0), name(null_clex_str), orig_name(0),
+  common_flags(IS_AUTO_GENERATED_NAME)
+{
+  marker= 0;
+  maybe_null= null_value= with_window_func= with_field= false;
+  in_rollup= 0;
+  with_param= 0;
+  join_tab_idx= MAX_TABLES;
+}
+

 const TABLE_SHARE *Item::field_table_or_null()
 {
diff --git a/sql/item.h b/sql/item.h
index e188663fddd..a9b622553ae 100644
--- a/sql/item.h
+++ b/sql/item.h
@@ -963,6 +963,7 @@ class Item: public Value_source,
      optimisation changes in prepared statements
   */
   Item(THD *thd, Item *item);
+  Item();                                        /* For const item */
   virtual ~Item()
   {
 #ifdef EXTRA_DEBUG
@@ -2866,6 +2867,7 @@ class Item_basic_constant :public Item_basic_value
 {
 public:
   Item_basic_constant(THD *thd): Item_basic_value(thd) {};
+  Item_basic_constant(): Item_basic_value() {};
   bool check_vcol_func_processor(void *arg) { return false; }
   const Item_const *get_item_const() const { return this; }
   virtual Item_basic_constant *make_string_literal_concat(THD *thd,
@@ -3247,7 +3249,8 @@ class Item_literal: public Item_basic_constant
 public:
   Item_literal(THD *thd): Item_basic_constant(thd)
   { }
-  Type type() const override { return CONST_ITEM; }
+  Item_literal(): Item_basic_constant()
+    Type type() const override { return CONST_ITEM; }
   bool check_partition_func_processor(void *int_arg) override { return false;}
   bool const_item() const override { return true; }
   bool basic_const_item() const override { return true; }
@@ -3260,6 +3263,7 @@ class Item_num: public Item_literal
 {
 public:
   Item_num(THD *thd): Item_literal(thd) { collation= DTCollation_numeric(); }
+  Item_num(): Item_literal() { collation= DTCollation_numeric(); }
   Item *safe_charset_converter(THD *thd, CHARSET_INFO *tocs) override;
   bool get_date(THD *thd, MYSQL_TIME *ltime, date_mode_t fuzzydate) override
   {
@@ -4239,6 +4243,10 @@ class Item_int :public Item_num
       unsigned_flag= flag;
     }
   Item_int(THD *thd, const char *str_arg, size_t length=64);
+  Item_int(ulonglong i, size_t length):
+    Item_num(), value((longlong)i)
+    { max_length=(uint32)length; unsigned_flag= 1; }
+
   const Type_handler *type_handler() const override
   { return type_handler_long_or_longlong(); }
   Field *create_field_for_create_select(MEM_ROOT *root, TABLE *table) override
@@ -4272,6 +4280,7 @@ class Item_bool :public Item_int
   Item_bool(THD *thd, const char *str_arg, longlong i):
     Item_int(thd, str_arg, i, 1) {}
   Item_bool(THD *thd, bool i) :Item_int(thd, (longlong) i, 1) { }
+  Item_bool(bool i) :Item_int((longlong) i, 1) { }
   bool is_bool_literal() const override { return true; }
   void print(String *str, enum_query_type query_type) override;
   Item *neg_transformer(THD *thd) override;
@@ -4293,7 +4302,7 @@ class Item_bool_static :public Item_bool
 {
 public:
   Item_bool_static(const char *str_arg, longlong i):
-    Item_bool(NULL, str_arg, i) {};
+    Item_bool(str_arg, i) {};

   void set_join_tab_idx(uint join_tab_idx_arg) override
   { DBUG_ASSERT(0); }

And this is not even the full list.
I prefer the single test for THD == NULL instead of having a lot of
items taking parameters without
THD which is much easier to use wrongly.

> > +/*
> > +  This function is needed to ensure that Item_bool_static doesn't change
> > +  the value of the member str_value.
> > +*/
> > +
> > +void Item_bool::print(String *str, enum_query_type query_type)
> > +{
> > +  // my_charset_bin is good enough for numbers
> > +  String tmp(value ? (char*) "1" : (char*) "0" , 1, &my_charset_bin);
> > +  str->append(tmp);
> > +}
>
> This is not needed anymore. Item_bool inherits from Item_int,
> and Item_int::print is

>   void Item_int::print(String *str, enum_query_type query_type)
>   {
>     StringBuffer<LONGLONG_BUFFER_SIZE> buf;
>     // my_charset_bin is good enough for numbers
>     buf.set_int(value, unsigned_flag, &my_charset_bin);
>     str->append(buf);
>   }
>
> You removed str_value from Item_int::print in July.

Thanks, good point!  Will fix.

> >  Item *Item_bool::neg_transformer(THD *thd)
> >  {
> >    value= !value;
> > diff --git a/sql/item.h b/sql/item.h
> > index a1c288ab1f0..1087c08869e 100644
> > --- a/sql/item.h
> > +++ b/sql/item.h
> > @@ -1976,7 +1976,7 @@ class Item: public Value_source,
> >    virtual bool limit_index_condition_pushdown_processor(void *arg) { return 0; }
> >    virtual bool exists2in_processor(void *arg) { return 0; }
> >    virtual bool find_selective_predicates_list_processor(void *arg) { return 0; }
> > -  bool cleanup_is_expensive_cache_processor(void *arg)
> > +  virtual bool cleanup_is_expensive_cache_processor(void *arg)
>
> You've added a new virtual method? First you try so hard to remove
> virtual methods, then you add a new one specifically for these two
> static items. Why does it matter what
> cleanup_is_expensive_cache_processor will do with them? You overwrite
> is_expensive() anyway.

The purpose if cleanup_is_expensive_cache_processor() is to mark all items that
they are expensive.  This happens if the item contains a sub query. We
cannot have this
for const or shared items.
(I don't understand why we have to mark all items and just not the top
item or the ones
that contains a sub query.  This is something that I have to talk with
Spetrunia about).
In that sense, it has nothing to with is_expensive().

> > diff --git a/sql/opt_index_cond_pushdown.cc b/sql/opt_index_cond_pushdown.cc
> > index 15bc2074e1f..e950ad1b7ca 100644
> > --- a/sql/opt_index_cond_pushdown.cc
> > +++ b/sql/opt_index_cond_pushdown.cc
> > @@ -185,8 +185,8 @@ bool uses_index_fields_only(Item *item, TABLE *tbl, uint keyno,
> >  static Item *make_cond_for_index(THD *thd, Item *cond, TABLE *table, uint keyno,
> >                                   bool other_tbls_ok)
> >  {
> > -  if (!cond)
> > -    return NULL;
> > +  if (!cond || cond->basic_const_item())
> > +    return cond;
>
> The effect of this change is that you don't set
> cond->marker= ICP_COND_USES_INDEX_ONLY for basic_const_item's anymore.

Yes, which is actually correct.  The marker is not useful for these.

> This looked suspicious at first. But then I noticed (and Sergey Petrunia
> confirmed) that ICP_COND_USES_INDEX_ONLY is not used at all. Since 5.3.
>
> So here I'd suggest to revert this your change and completely remove any
> ICP_COND_USES_INDEX_ONLY manipulations together with the
> ICP_COND_USES_INDEX_ONLY definition.

Will do that in a separate patch.

> >    if (cond->type() == Item::COND_ITEM)
> >    {
> >      uint n_marked= 0;
> > diff --git a/sql/sql_parse.cc b/sql/sql_parse.cc
> > index 119c7360f07..e63e120a02b 100644
> > --- a/sql/sql_parse.cc
> > +++ b/sql/sql_parse.cc
> > @@ -2395,6 +2395,7 @@ dispatch_command_return dispatch_command(enum enum_server_command command, THD *
> >      thd->update_server_status();
> >      thd->protocol->end_statement();
> >      query_cache_end_of_result(thd);
> > +    /* Check that the query didn't change const items */
>
> what does that mean? Is it a TODO comment? Or it documents what happens
> below?

This must be there after some rebases. I have now removed it.

> >    }
> >    if (drop_more_results)
> >      thd->server_status&= ~SERVER_MORE_RESULTS_EXISTS;
> > diff --git a/sql/sql_select.cc b/sql/sql_select.cc
> > index 93f5d3591ed..894d511adf4 100644
> > --- a/sql/sql_select.cc
> > +++ b/sql/sql_select.cc
> > @@ -2364,7 +2364,7 @@ int JOIN::optimize_stage2()
> >    if (!conds && outer_join)
> >    {
> >      /* Handle the case where we have an OUTER JOIN without a WHERE */
> > -    conds= new (thd->mem_root) Item_bool(thd, true); // Always true
> > +    conds= (Item*) &Item_true;
>
> Hmm, does casting const away even work in the Intel compiler?

Yes, it works.  The item is still in the const segments, so any writes
to it will cause a sigseg.

> >    }
> >
> >    if (impossible_where)
> > @@ -22743,6 +22743,8 @@ make_cond_for_table_from_pred(THD *thd, Item *root_cond, Item *cond,
> >        return new_cond;
> >      }
> >    }
> > +  else if (cond->basic_const_item())
> > +    return cond;
> >
> >    if (is_top_and_level && used_table == rand_table_bit &&
> >        (cond->used_tables() & ~OUTER_REF_TABLE_BIT) != rand_table_bit)
>
> this is rather fragile, isn't it?
> You skip item->set_join_tab_idx(join_tab_idx_arg),
> quite understandably.

Not really. For a basic const item, it is always safe to return just
cond, as there is nothing else to optimize for it.
In any case, the above is always false because the function starts with:

 if (used_table && !(cond->used_tables() & used_table))
    return (COND*) 0;                // Already checked
This means that used_table must be 0 here for a basic const items
(that has used_tables() == 0)

> And currently, as far as I can see, item->get_join_tab_idx()
> is always used under if (!item->const_item()), so what you did is safe.

> But can it change in the future? Or is join_tab_idx something that
> conceptually doesn't make sense for const items?

item_basic_const has:
  join_tab_idx= MAX_TABLES;

this means it is applicable to all join levels and void
JOIN::get_partial_cost_and_fanout() should not
do anything with it.

In general, ignore basic_const items for conds are safe as there is
nothing we can optimize with them.
There is only two cases:
1 -> return all rows
0 -> don't return any rows.

Regards,
Monty


Follow ups

References