← Back to team overview

maria-developers team mailing list archive

Review for version#2 of MDEV-11297: Add support for LIMIT clause in GROUP_CONCAT

 

Hi Varun,

Nice to see that nearly all comments from the previous review were addressed.
Please find my further input below.

General questions:
How is DISTINCT handled? Syntax like

  GROUP_CONCAT(DISTINCT col ORDER BY ... LIMIT ...)

is allowed by the parser, however I don't see any handling for DISTINCT in your
patch.

Did you forget to update group_concat_limit.result file? Please check.


> diff -urpN '--exclude=.*' 10.2-varun-mdev11297-orig/sql/item_sum.cc 10.2-varun-mdev11297-noc/sql/item_sum.cc
> --- 10.2-varun-mdev11297-orig/sql/item_sum.cc	2016-12-11 12:17:04.023961449 +0300
> +++ 10.2-varun-mdev11297-noc/sql/item_sum.cc	2016-12-11 12:18:40.411957365 +0300
> @@ -3114,6 +3114,11 @@ int dump_leaf_key(void* key_arg, element
>    String *result= &item->result;
>    Item **arg= item->args, **arg_end= item->args + item->arg_count_field;
>    uint old_length= result->length();
> +  ulonglong *offset_limit= &item->copy_offset_limit;
> +  ulonglong *row_limit = &item->copy_row_limit;
> +
> +  if (item->limit_clause && !(*row_limit))
> +     return 1;
>  
>    if (item->no_appended)
>      item->no_appended= FALSE;
> @@ -3122,20 +3127,23 @@ int dump_leaf_key(void* key_arg, element
>  
>    tmp.length(0);
>  
> -  for (; arg < arg_end; arg++)
> +  if (!item->limit_clause || !(*offset_limit))
>    {
> -    String *res;
> -    /*
> -      We have to use get_tmp_table_field() instead of
> -      real_item()->get_tmp_table_field() because we want the field in
> -      the temporary table, not the original field
> -      We also can't use table->field array to access the fields
> -      because it contains both order and arg list fields.
> -     */
> -    if ((*arg)->const_item())
> -      res= (*arg)->val_str(&tmp);
> -    else
> +
> +    for (; arg < arg_end; arg++)
>      {
> +      String *res;
> +      /*
> +        We have to use get_tmp_table_field() instead of
> +        real_item()->get_tmp_table_field() because we want the field in
> +        the temporary table, not the original field
> +        We also can't use table->field array to access the fields
> +        because it contains both order and arg list fields.
> +      */
> +      if ((*arg)->const_item())
> +        res= (*arg)->val_str(&tmp);
> +      else
> +      {
>        Field *field= (*arg)->get_tmp_table_field();

Please make the code to use the right indentation.

>        if (field)
>        {
> @@ -3146,9 +3154,19 @@ int dump_leaf_key(void* key_arg, element
>        }
>        else
>          res= (*arg)->val_str(&tmp);
> +      }
> +      if (res)
> +      {
> +        result->append(*res);
> +      }
>      }
> -    if (res)
> -      result->append(*res);
> +    if ((*row_limit))
> +        (*row_limit)--;
> +  }
> +  else
> +  {
> +    item->no_appended= TRUE;

If I comment out the above line, the test still passes. Please add test
coverage that shows it is needed, or remove it.

> +    (*offset_limit)--;
>    }
>  
>    item->row_count++;
> @@ -3199,7 +3217,8 @@ Item_func_group_concat::
>  Item_func_group_concat(THD *thd, Name_resolution_context *context_arg,
>                         bool distinct_arg, List<Item> *select_list,
>                         const SQL_I_List<ORDER> &order_list,
> -                       String *separator_arg)
> +                       String *separator_arg, bool limit_clause,
> +                       Item *rowLimit, Item *offsetLimit)

camelCase or javaStyleVariableNames are against the coding style.
Please use something like row_limit_arg if there is no better option.

(In general, please try to make your new code fit in with the rest of the code.
Did you see camelCase anywhere? If not, it's a sign not to use it).

>    :Item_sum(thd), tmp_table_param(0), separator(separator_arg), tree(0),
>     unique_filter(NULL), table(0),
>     order(0), context(context_arg),
> @@ -3208,7 +3227,9 @@ Item_func_group_concat(THD *thd, Name_re
>     row_count(0),
>     distinct(distinct_arg),
>     warning_for_row(FALSE),
> -   force_copy_fields(0), original(0)
> +   force_copy_fields(0),row_limit(0),
> +   offset_limit(0),limit_clause(limit_clause),
> +   copy_offset_limit(0), copy_row_limit(0),original(0)

Please use 'NULL' for pointers, not '0'.

>  {
>    Item *item_select;
>    Item **arg_ptr;
> @@ -3250,6 +3271,16 @@ Item_func_group_concat(THD *thd, Name_re
>    /* orig_args is only used for print() */
>    orig_args= (Item**) (order + arg_count_order);
>    memcpy(orig_args, args, sizeof(Item*) * arg_count);
> +
> +  if (limit_clause)
> +  {
> +    if (!(row_limit=(Item*) thd->alloc(sizeof(Item))))
> +      return;

I'm not sure what the above is supposed to mean? Why allocate sizeof(Item)
bytes?

> +    row_limit= rowLimit;

... especially when the next thing you do is the above assignment, so you 
lose the value that you've got from the thd->alloc() call?

> +    if (!(offset_limit= (Item*)thd->alloc(sizeof(Item))))
> +      return;
> +    offset_limit= offsetLimit;

The same question as above.

> +  }
>  }
>  
>  
> @@ -3269,7 +3300,9 @@ Item_func_group_concat::Item_func_group_
>    warning_for_row(item->warning_for_row),
>    always_null(item->always_null),
>    force_copy_fields(item->force_copy_fields),
> -  original(item)
> +  row_limit(item->row_limit), offset_limit(item->offset_limit),
> +  limit_clause(item->limit_clause),copy_offset_limit(item->copy_offset_limit),
> +  copy_row_limit(item->copy_row_limit), original(item->original)
>  {
>    quick_group= item->quick_group;
>    result.set_charset(collation.collation);
> @@ -3382,6 +3415,10 @@ void Item_func_group_concat::clear()
>    null_value= TRUE;
>    warning_for_row= FALSE;
>    no_appended= TRUE;
> +  if (offset_limit)
> +     copy_offset_limit= offset_limit->val_int();
> +  if (row_limit)
> +     copy_row_limit= row_limit->val_int();
>    if (tree)
>      reset_tree(tree);
>    if (unique_filter)
> diff -urpN '--exclude=.*' 10.2-varun-mdev11297-orig/sql/item_sum.h 10.2-varun-mdev11297-noc/sql/item_sum.h
> --- 10.2-varun-mdev11297-orig/sql/item_sum.h	2016-12-11 12:17:04.043961448 +0300
> +++ 10.2-varun-mdev11297-noc/sql/item_sum.h	2016-12-11 12:18:40.415957365 +0300
> @@ -1583,6 +1583,12 @@ class Item_func_group_concat : public It
>    bool always_null;
>    bool force_copy_fields;
>    bool no_appended;
> +  Item *row_limit;
> +  Item *offset_limit;
> +  bool limit_clause;
> +  ulonglong copy_offset_limit;
> +  ulonglong copy_row_limit;
> +

In previous review I have requested:
>> Please add (short) comments describing each new member variable.

And you still haven't done that.

>    /*
>      Following is 0 normal object and pointer to original one for copy
>      (to correctly free resources)
> @@ -1602,7 +1608,8 @@ protected:
>  public:
>    Item_func_group_concat(THD *thd, Name_resolution_context *context_arg,
>                           bool is_distinct, List<Item> *is_select,
> -                         const SQL_I_List<ORDER> &is_order, String *is_separator);
> +                         const SQL_I_List<ORDER> &is_order, String *is_separator,
> +                         bool limit_clause, Item *row_limit, Item *offset_limit);
>  
>    Item_func_group_concat(THD *thd, Item_func_group_concat *item);
>    ~Item_func_group_concat();
> diff -urpN '--exclude=.*' 10.2-varun-mdev11297-orig/sql/sql_yacc.yy 10.2-varun-mdev11297-noc/sql/sql_yacc.yy
> --- 10.2-varun-mdev11297-orig/sql/sql_yacc.yy	2016-12-11 12:17:04.107961446 +0300
> +++ 10.2-varun-mdev11297-noc/sql/sql_yacc.yy	2016-12-11 12:18:40.887957345 +0300
> @@ -1786,7 +1786,7 @@ bool my_yyoverflow(short **a, YYSTYPE **
>  %type <num>
>          order_dir lock_option
>          udf_type opt_local opt_no_write_to_binlog
> -        opt_temporary all_or_any opt_distinct
> +        opt_temporary all_or_any opt_distinct opt_glimit_clause
>          opt_ignore_leaves fulltext_options union_option
>          opt_not
>          select_derived_init transaction_access_mode_types
> @@ -1844,7 +1844,7 @@ bool my_yyoverflow(short **a, YYSTYPE **
>          opt_escape
>          sp_opt_default
>          simple_ident_nospvar simple_ident_q
> -        field_or_var limit_option
> +        field_or_var limit_option glimit_option
>          part_func_expr
>          window_func_expr
>          window_func
> @@ -10449,16 +10449,34 @@ sum_expr:
>          | GROUP_CONCAT_SYM '(' opt_distinct
>            { Select->in_sum_expr++; }
>            expr_list opt_gorder_clause
> -          opt_gconcat_separator
> +          opt_gconcat_separator opt_glimit_clause
>            ')'
>            {
>              SELECT_LEX *sel= Select;
>              sel->in_sum_expr--;
> -            $$= new (thd->mem_root)
> -                  Item_func_group_concat(thd, Lex->current_context(), $3, $5,
> -                                         sel->gorder_list, $7);
> -            if ($$ == NULL)
> -              MYSQL_YYABORT;
> +            if (!$8)
> +             $$= new (thd->mem_root)
> +                   Item_func_group_concat(thd, Lex->current_context(), $3, $5,
> +                                        sel->gorder_list, $7, FALSE, 0, 0);
> +            else
> +            {
> +              if (sel->select_limit && sel->offset_limit)
> +                 $$= new (thd->mem_root)
> +                        Item_func_group_concat(thd, Lex->current_context(), $3, $5,
> +                                         sel->gorder_list, $7, TRUE, sel->select_limit,
> +                                         sel->offset_limit);
> +              else if (sel->select_limit)
> +                 $$= new (thd->mem_root)
> +                        Item_func_group_concat(thd, Lex->current_context(), $3, $5,
> +                                         sel->gorder_list, $7, TRUE, sel->select_limit, 0);
> +            }
> +             if ($$ == NULL)
> +               MYSQL_YYABORT;
> +
> +            sel->select_limit= NULL;
> +            sel->offset_limit= NULL;
> +            sel->explicit_limit= 0;
> +
>              $5->empty();
>              sel->gorder_list.empty();
>            }
> @@ -10680,6 +10698,95 @@ gorder_list:
>            { if (add_gorder_to_list(thd, $1,(bool) $2)) MYSQL_YYABORT; }
>          ;

What is the difference between glimit_option and limit_option? Looking at the
grammar, they look the same. Please either remove glimit_option or explain why
do you need it.

I can see that limit_clause and glimit_clause have to be different because
limit_clause also supports LIMIT ROWS EXAMINED.

Actually, there is one thing in limit_clause that I think should be in
glimit_clause also. It's this call:

              Lex->set_stmt_unsafe(LEX::BINLOG_STMT_UNSAFE_LIMIT);

Did you omit it intentionally? Let's meet and discuss it.

> +opt_glimit_clause:
> +          /* empty */ { $$ = 0; }
> +        | glimit_clause { $$ = 1; }
> +        ;
> +
> +glimit_clause_init:
> +          LIMIT{}
> +        ;
> +
> +glimit_clause:
> +          glimit_clause_init glimit_options{}
> +        ;
> +
> +glimit_options:
> +          glimit_option
> +          {
> +            SELECT_LEX *sel= Select;
> +            sel->select_limit= $1;
> +            sel->offset_limit= 0;
> +            sel->explicit_limit= 1;
> +          }
> +        | glimit_option ',' glimit_option
> +          {
> +            SELECT_LEX *sel= Select;
> +            sel->select_limit= $3;
> +            sel->offset_limit= $1;
> +            sel->explicit_limit= 1;
> +          }
> +        | glimit_option OFFSET_SYM glimit_option
> +          {
> +            SELECT_LEX *sel= Select;
> +            sel->select_limit= $1;
> +            sel->offset_limit= $3;
> +            sel->explicit_limit= 1;
> +          }
> +        ;
> +
> +glimit_option:
> +        ident
> +        {
> +          Item_splocal *splocal;
> +          LEX *lex= thd->lex;
> +          Lex_input_stream *lip= & thd->m_parser_state->m_lip;
> +          sp_variable *spv;
> +          sp_pcontext *spc = lex->spcont;
> +          if (spc && (spv = spc->find_variable($1, false)))
> +          {
> +            splocal= new (thd->mem_root)
> +              Item_splocal(thd, $1, spv->offset, spv->sql_type(),
> +                  lip->get_tok_start() - lex->sphead->m_tmp_query,
> +                  lip->get_ptr() - lip->get_tok_start());
> +            if (splocal == NULL)
> +              MYSQL_YYABORT;
> +#ifndef DBUG_OFF
> +            splocal->m_sp= lex->sphead;
> +#endif
> +            lex->safe_to_cache_query=0;
> +          }
> +          else
> +            my_yyabort_error((ER_SP_UNDECLARED_VAR, MYF(0), $1.str));
> +          if (splocal->type() != Item::INT_ITEM)
> +            my_yyabort_error((ER_WRONG_SPVAR_TYPE_IN_LIMIT, MYF(0)));
> +          splocal->limit_clause_param= TRUE;
> +          $$= splocal;
> +        }
> +        | param_marker
> +        {
> +          $1->limit_clause_param= TRUE;
> +        }
> +        | ULONGLONG_NUM
> +          {
> +            $$= new (thd->mem_root) Item_uint(thd, $1.str, $1.length);
> +            if ($$ == NULL)
> +              MYSQL_YYABORT;
> +          }
> +        | LONG_NUM
> +          {
> +            $$= new (thd->mem_root) Item_uint(thd, $1.str, $1.length);
> +            if ($$ == NULL)
> +              MYSQL_YYABORT;
> +          }
> +        | NUM
> +          {
> +            $$= new (thd->mem_root) Item_uint(thd, $1.str, $1.length);
> +            if ($$ == NULL)
> +              MYSQL_YYABORT;
> +          }
> +        ;
> +
>  in_sum_expr:
>            opt_all
>            {

BR
 Sergei
-- 
Sergei Petrunia, Software Developer
MariaDB Corporation | Skype: sergefp | Blog: http://s.petrunia.net/blog