maria-developers team mailing list archive
-
maria-developers team
-
Mailing list archive
-
Message #13161
Review #3 for: MDEV-26278 Elimination: consider GROUP BY as unique key
Hi Oleg,
Please fix compile errors on this platform:
https://buildbot.mariadb.org/#/builders/146/builds/13126/steps/5/logs/stdio
also please apply the attached diff that fixes the coding style and wording
of comments (input and/or adjustments are welcome).
After this, the patch is ok to push.
The next steps are to create a branch with name like
preview-10.10-mdev26278-table-elimination and to officially submit it for
preview tree build and testing.
> commit 7275bf4939248959b5017314d6b0393296831c0c (HEAD, origin/bb-10.9-MDEV-26278-look)
> Author: Oleg Smirnov <olernov@xxxxxxxxx>
> Date: Thu Feb 17 22:53:37 2022 +0700
>
> MDEV-26278 Add functionality to eliminate derived tables from the query
>
> Elimination of unnecessary tables from SQL queries is already present
> in MariaDB. But it only works for regular tables and not for derived ones.
>
> Imagine we have a view:
> CREATE VIEW v1 AS SELECT a, b, max(c) AS maxc FROM t1 GROUP BY a, b
>
> Due to "GROUP BY a, b" the values of combinations {a, b} are unique,
> and this fact can be treated as like derived table "v1" has a unique key
> on fields {a, b}.
>
> Suppose we have a SQL query:
> SELECT t2.* FROM t2 LEFT JOIN v1 ON t2.a=v1.a and t2.b=v1.b
>
> 1. Since {v1.a, v1.b} is unique and both these fields are bound to t2,
> "v1" is functionally dependent on t2.
> This means every record of "t2" will be either joined with
> a single record of "v1" or NULL-complemented.
> 2. No fields of "v1" are present on the SELECT list
>
> These two facts allow the server to completely exclude (eliminate)
> the derived table "v1" from the query.
BR
Sergei
--
Sergei Petrunia, Software Developer
MariaDB Corporation | Skype: sergefp | Blog: http://petrunia.net
diff --git a/sql/opt_table_elimination.cc b/sql/opt_table_elimination.cc
index ab8bf07868b..587ff800fb8 100644
--- a/sql/opt_table_elimination.cc
+++ b/sql/opt_table_elimination.cc
@@ -1675,8 +1675,8 @@ Dep_analysis_context::create_table_value(TABLE_LIST *table_list)
if (key->flags & HA_NOSAME)
{
Dep_module_key *key_dep;
- if (!(key_dep=
- new Dep_module_key(tbl_dep, i, key->user_defined_key_parts)))
+ if (!(key_dep= new Dep_module_key(tbl_dep, i,
+ key->user_defined_key_parts)))
return NULL;
*key_list= key_dep;
key_list= &(key_dep->next_table_key);
@@ -1689,8 +1689,16 @@ Dep_analysis_context::create_table_value(TABLE_LIST *table_list)
/*
- Create unique pseudo-key and add it as a dependency
- if the query is not a UNION (ALL) and has GROUP BY expression
+ @brief
+ Check if we can create a unique pseudo-key for the passed table.
+ If we can, create a dependency for it
+
+ @detail
+ Currently, pseudo-key is created for the list of GROUP BY columns.
+
+ TODO: also it can be created if the query uses
+ - SELECT DISTINCT
+ - UNION DISTINCT (not UNION ALL)
*/
void Dep_analysis_context::create_unique_pseudo_key_if_needed(
@@ -1716,10 +1724,6 @@ void Dep_analysis_context::create_unique_pseudo_key_if_needed(
*/
if (first_select && first_select->group_list.elements > 0)
{
- /*
- Make sure GROUP BY elements contain only fields
- and no functions or other expressions
- */
bool valid= true;
std::set<field_index_t> exposed_fields_indexes;
for (auto cur_group= first_select->group_list.first;
@@ -1727,6 +1731,10 @@ void Dep_analysis_context::create_unique_pseudo_key_if_needed(
cur_group= cur_group->next)
{
auto elem= *(cur_group->item);
+ /*
+ Make sure GROUP BY elements contain only fields
+ and no functions or other expressions
+ */
if (elem->type() != Item::FIELD_ITEM)
{
valid= false;
@@ -1736,30 +1744,29 @@ void Dep_analysis_context::create_unique_pseudo_key_if_needed(
if (field_idx == -1)
{
/*
- This is not a invalid case.
- We can construct a unique pseudo-key only in case when all field from
- the GROUP BY list are present on the SELECT list. For example,
- "SELECT a FROM t1 GROUP by a, b":
- values of "a" are not unique, only a combination of "a" and "b"
- is unique. So both fields must be present in the SELECT list
+ This GROUP BY element is not present in the select list. This is a
+ case like this:
+ (SELECT a FROM t1 GROUP by a,b) as TBL
+ Here, the combination of (a,b) is unique, but the select doesn't
+ include "b". "a" alone is not unique, so TBL doesn't have a unique
+ pseudo-key.
*/
valid= false;
break;
}
-
exposed_fields_indexes.insert(field_idx);
-
}
if (valid)
{
Dep_module_pseudo_key *pseudo_key;
- pseudo_key= new Dep_module_pseudo_key(
- tbl_dep, std::move(exposed_fields_indexes));
+ pseudo_key= new Dep_module_pseudo_key(tbl_dep,
+ std::move(exposed_fields_indexes));
tbl_dep->pseudo_key= pseudo_key;
}
}
}
+
/*
Iterate the list of fields and look for the given field.
Returns the index of the field if it is found on the list
@@ -1771,16 +1778,13 @@ int Dep_analysis_context::find_field_in_list(List<Item> &fields_list,
{
List_iterator<Item> it(fields_list);
int field_idx= 0;
- while (auto next_field= it.peek())
+ while (auto next_field= it++)
{
if (next_field->eq(field, false))
- {
return field_idx;
- }
field_idx++;
- it++;
}
- return -1 /*not found*/;
+ return -1; /*not found*/
}
@@ -1941,8 +1945,7 @@ Dep_module_pseudo_key::get_next_unbound_value(Dep_analysis_context *dac,
/*
- Determine whether the field_index is present in the previously composed
- array of derived_table_field_indexes
+ Check if column number field_index is covered by the pseudo-key.
*/
bool Dep_module_pseudo_key::covers_field(int field_index)
Follow ups