← Back to team overview

maria-developers team mailing list archive

Re: MDEV-21318: Wrong results with window functions and implicit grouping

 

Hi Varun,
> revision-id: fc34657511a9aa08dd92f7363dc53f58934f9673 (mariadb-10.2.29-62-gfc34657511a)
> parent(s): f0aa073f2bf3d8d85b3d028df89cdb4cdfc4002d
> author: Varun Gupta
> committer: Varun Gupta
> timestamp: 2019-12-17 16:40:06 +0530
> message:
> 
> MDEV-21318: Wrong results with window functions and implicit grouping
> 
> The issue here is for degenerate joins we should execute the window
> function but it is not getting executed in all the cases.
> 
> To get the window function values window function needs to be executed
> always. This currently does not happen in few cases
> where the join would return 0 or 1 row like
>   1) IMPOSSIBLE WHERE
>   2) IMPOSSIBLE HAVING
>   3) MIN/MAX optimization
>   4) EMPTY CONST TABLE
>   5) ZERO LIMIT
> 
> The fix is to make sure that window functions get executed
> and the temporary table is setup for the execution of window functions
> 
> ---
>  mysql-test/r/win.result | 51 ++++++++++++++++++++++++++++++++++++++++++++
>  mysql-test/t/win.test   | 56 +++++++++++++++++++++++++++++++++++++++++++++++++
>  sql/sql_select.cc       | 33 ++++++++++++++++++++++++++++-
>  sql/sql_select.h        |  1 +
>  4 files changed, 140 insertions(+), 1 deletion(-)
...

> diff --git a/sql/sql_select.cc b/sql/sql_select.cc
> index c9cb533aa33..f6943b18cee 100644
> --- a/sql/sql_select.cc
> +++ b/sql/sql_select.cc
> @@ -1447,6 +1447,7 @@ JOIN::optimize_inner()
>          zero_result_cause= "Zero limit";
>        }
>        table_count= top_join_tab_count= 0;
> +      implicit_grouping_with_window_funcs();

not needed + redundant (I'll elaborate below)

>        error= 0;
>        goto setup_subq_exit;
>      }
> @@ -1502,6 +1503,7 @@ JOIN::optimize_inner()
>  	zero_result_cause= "No matching min/max row";
>          table_count= top_join_tab_count= 0;
>  	error=0;
> +        implicit_grouping_with_window_funcs();

not needed + redundant (I'll elaborate below)

>          goto setup_subq_exit;
>        }
>        if (res > 1)
> @@ -1517,6 +1519,7 @@ JOIN::optimize_inner()
>        tables_list= 0;				// All tables resolved
>        select_lex->min_max_opt_list.empty();
>        const_tables= top_join_tab_count= table_count;
> +      implicit_grouping_with_window_funcs();

This doesn't seem to be needed? 

>        /*
>          Extract all table-independent conditions and replace the WHERE
>          clause with them. All other conditions were computed by opt_sum_query
> @@ -1615,6 +1618,7 @@ JOIN::optimize_inner()
>      zero_result_cause= "no matching row in const table";
>      DBUG_PRINT("error",("Error: %s", zero_result_cause));
>      error= 0;
> +    implicit_grouping_with_window_funcs();

OK 
>      goto setup_subq_exit;
>    }
>    if (!(thd->variables.option_bits & OPTION_BIG_SELECTS) &&
> @@ -1639,6 +1643,7 @@ JOIN::optimize_inner()
>      zero_result_cause=
>        "Impossible WHERE noticed after reading const tables";
>      select_lex->mark_const_derived(zero_result_cause);
> +    implicit_grouping_with_window_funcs();

Ok
>      goto setup_subq_exit;
>    }
>  
> @@ -1781,6 +1786,7 @@ JOIN::optimize_inner()
>      zero_result_cause=
>        "Impossible WHERE noticed after reading const tables";
>      select_lex->mark_const_derived(zero_result_cause);
> +    implicit_grouping_with_window_funcs();
>      goto setup_subq_exit;

There is no test coverage for this case. Please add.

>    }
>  
> @@ -18225,7 +18231,8 @@ void set_postjoin_aggr_write_func(JOIN_TAB *tab)
>      }
>    }
>    else if (join->sort_and_group && !tmp_tbl->precomputed_group_by &&
> -           !join->sort_and_group_aggr_tab && join->tables_list)
> +           !join->sort_and_group_aggr_tab && join->tables_list &&
> +           join->top_join_tab_count)
>    {
>      DBUG_PRINT("info",("Using end_write_group"));
>      aggr->set_write_func(end_write_group);
> @@ -26924,6 +26931,30 @@ Item *remove_pushed_top_conjuncts(THD *thd, Item *cond)
>    return cond;
>  }
>  
> +/*
> +  There are 5 cases in which we shortcut the join optimization process as we
> +  conclude that the join would be a degenerate one
> +    1) IMPOSSIBLE WHERE
> +    2) IMPOSSIBLE HAVING
> +    3) MIN/MAX optimization (@see opt_sum_query)
> +    4) EMPTY CONST TABLE
> +    5) ZERO LIMIT
> +  If a window function is present in any of the above cases then to get the
> +  result of the window function, we need to execute it. So we need to

Do we really need to do this for cases #2 and #5? In these case, no ouptut is
produced, why compute the window functions?

> +  create a temporary table for its execution. Here we need to take in mind
> +  that aggregate functions and non-aggregate function need not be executed.
> +
> +*/
> +
> +
> +void JOIN::implicit_grouping_with_window_funcs()

The name of the function doesn't look right. It's a noun ("subject") while
most functions start with verb (e.g. JOIN::optimize_...)
 
handle_implicit_grouping_with_window_funcs() isn't ideal either but would be
better I think.

> +{
> +  if (select_lex->have_window_funcs() && send_row_on_empty_set())
> +  {
> +    const_tables= top_join_tab_count= table_count= 0;
> +  }
> +}
> +
>  /**
>    @} (end of group Query_Optimizer)
>  */

BR
 Sergei
-- 
Sergei Petrunia, Software Developer
MariaDB Corporation | Skype: sergefp | Blog: http://s.petrunia.net/blog