← Back to team overview

maria-developers team mailing list archive

Re: [Commits] 88a80e92dc4: MDEV-9959: A serious MariaDB server performance bug

 

Hi Varun,

On Wed, Apr 24, 2019 at 01:31:38PM +0530, Varun wrote:
> revision-id: 88a80e92dc444ce30718f3e08d3ab66fb02bcea4 (mariadb-10.3.10-284-g88a80e92dc4)
> parent(s): 3032cd8e91f1e1ead8b6f941e75cd29e473e7eaa
> author: Varun Gupta
> committer: Varun Gupta
> timestamp: 2019-04-24 13:31:24 +0530
> message:
> 
> MDEV-9959: A serious MariaDB server performance bug
> 
> Step #2: If any field in the select list of the derived tables is present in the group by list also , then we are again guaranteed
> that ref access to the derived table would always produce one row per key.
 
...

> diff --git a/sql/sql_lex.cc b/sql/sql_lex.cc
> index c52005e7683..aa645ac00ca 100644
> --- a/sql/sql_lex.cc
> +++ b/sql/sql_lex.cc
> @@ -7617,6 +7617,68 @@ Item *st_select_lex::build_cond_for_grouping_fields(THD *thd, Item *cond,
>  }
>  
>  
> +/**
> +  Check if any any item in the group by list is also present in the select_list
> +  @retval true: All elements common between select and group by list
> +*/
> +
> +void st_select_lex::is_group_by_prefix(KEY *keyinfo)

I think the name of the function is poor - "is_something()" hints at hints at boolean return
value, and I would say it doesn't imply any side effects.

Something like 'mark_derived_index_stats()' would be better, I think.


> +{
> +  uint key_parts= keyinfo->usable_key_parts;
> +  KEY_PART_INFO *key_part_info= keyinfo->key_part;
> +  bool found= FALSE;
> +
> +  if (key_parts < group_list.elements)
> +    return;
> +
> +  uint matched_fields=0, i, j;
> +  Item *item;
> +
> +  for (i= 0; i < key_parts; key_part_info++, i++)
> +  {
> +    uint fld_idx= key_part_info->fieldnr - 1;
> +    item= join->fields_list.elem(fld_idx);
> +    for (ORDER *order= group_list.first; order; order= order->next)
> +    {
> +      Item *ord_item= order->item[0]->real_item();
> +      Item_equal *item_equal= ord_item->get_item_equal();
> +
> +      if (item_equal)
> +      {
> +        Item_equal_fields_iterator it(*item_equal);
> +        Item *equal_item;
> +        while ((equal_item= it++))
> +        {
> +          if (equal_item->eq(item, 0))
> +          {
> +            matched_fields++;
> +            found= TRUE;
> +            break;
> +          }
> +        }
> +      }
> +      else
> +      {
> +        if (item->eq(ord_item, 0))
> +        {
> +          matched_fields++;
> +          found= TRUE;
> +        }
> +      }
> +      if (found)
> +        break;
> +    }
> +
> +    if (matched_fields == group_list.elements)
> +    {
> +      for (j=matched_fields - 1; j < key_parts; j++)
> +        keyinfo->rec_per_key[j]= 1;

This doesn't always work. An example:

create table t1 (a int, b int, c int, d int);
insert into t1 select a,a,a,a from one_k;

create table t2 as select * from t1;

# Note the "a, a AS b" in the select list. This is intentional.

explain select * 
from 
  t2,
  (select a, a as b from t1 group by a,b) TBL
where
  t2.a=TBL.a and t2.b=TBL.b;

Here, the select output is not guaranteed to be unique (as it contains only one
of two GROUP BY columns). But I see the execution to reac the above line,
assigning keyinfo->rec_per_key[j]= 1.

> +      return;
> +    }
> +    found= FALSE;
> +  }
> +}
> +
>  int set_statement_var_if_exists(THD *thd, const char *var_name,
>                                  size_t var_name_length, ulonglong value)
>  {

A general comment: do you think we should expose this in optimizer trace? 

At the moment, Derived-with-keys optimization doesn't print anything there
(neither in MariaDB nor in MySQL).  One can see the generated KEYUSEs when they
are printed.

But perhaps we could add a node that would print that a derived table was
added, with this cardinality, and index with these keypart cardinalities. 

This would make debugging easier. Any thoughts?

BR
 Sergei
-- 
Sergei Petrunia, Software Developer
MariaDB Corporation | Skype: sergefp | Blog: http://s.petrunia.net/blog