maria-developers team mailing list archive
-
maria-developers team
-
Mailing list archive
-
Message #11070
Re: [Commits] 60d1803ec64: MDEV-6707: Wrong result (extra row) with group by, multi-part key
On Thu, Jan 11, 2018 at 02:35:55AM +0530, Varun wrote:
> revision-id: 60d1803ec645b6d388aac7125ac6381ecee4d548 (mariadb-5.5.56-115-g60d1803ec64)
> parent(s): 9c9cf556a15af84f9133ef859b50596085f7ae8e
> author: Varun Gupta
> committer: Varun Gupta
> timestamp: 2018-01-11 01:51:23 +0530
> message:
>
> MDEV-6707: Wrong result (extra row) with group by, multi-part key
>
> Range optimizer creates ranges according to the where conditions. So this case involves using a composite key,
> few parts of which are involved in GROUP BY and few in the MIN/MAX functions in the select list.
> So during the execution of such queries, we try to find a prefix using the fields involved in GROUP BY
> and we then check if this newly returned prefix lies within the range we had calculated earlier.
> After we find a row that satisfies the partial range(involving fields of GROUP BY), then we look at the other fields
> of the where clause. We look in all the ranges if we can satisfy the above retured row.
>
> The issue is while calculating these prefixes we might encounter same partial ranges for multiple ranges.
> An example would be
> where condition is f2 ='c' AND f1 <> 9
> so the ranges would be
> (NULL,c <= f1,f2 <= c,9)
> (c,9 <= f1,f2 <= c,+infinity)
>
> In this case for calculating prefixes with the group by field we take up the
> partial ranges involving field f2 those would be
> c<= f2 <=c
> c<= f2 <=c
> So we lookup rows with the same prefix in all such ranges and then we check for the other
> part(in this case f1) in all the ranges. So lets take a row of the table to be (4,'c'), that would have
> a prefix 'c' and value for f1 to be 4, so it would lie in the first range mentioned above. But we would
> get this same record again when we read the partial range and this would again satisfy the first range.
> This issue can be fixed if we compare such partial ranges and don't lookup if we see the same prefix again.
This is a huge and long comment.
Please make it smaller, and move at least part of it into the code. Good places
would be:
- The function comment for QUICK_RANGE_SELECT::get_next_prefix
- the new lines in QUICK_RANGE_SELECT::get_next_prefix which use/update the
value of save_last_range
>
> ---
> mysql-test/r/range.result | 14 ++++++++++++++
> mysql-test/r/range_mrr_icp.result | 14 ++++++++++++++
> mysql-test/t/range.test | 13 +++++++++++++
> sql/opt_range.cc | 20 +++++++++++++++++---
> sql/opt_range.h | 6 ++++--
> 5 files changed, 62 insertions(+), 5 deletions(-)
>
> diff --git a/mysql-test/r/range.result b/mysql-test/r/range.result
> index 630a692cef6..72b31de4df4 100644
> --- a/mysql-test/r/range.result
> +++ b/mysql-test/r/range.result
> @@ -2144,3 +2144,17 @@ value1 1000685 12345
> value1 1003560 12345
> value1 1004807 12345
> drop table t1;
> +#
> +# MDEV-6707: Wrong result (extra row) with group by, multi-part key
> +#
> +CREATE TABLE t1 (f1 INT, f2 VARCHAR(1), KEY(f2,f1)) ENGINE=InnoDB;
> +INSERT INTO t1 VALUES
> +(7,'v'),(0,'s'),(9,'l'),(4,'c');
> +explain
> +SELECT MAX(f1), f2 FROM t1 WHERE f2 LIKE 'c%' AND f1 <> 9 GROUP BY f2;
> +id select_type table type possible_keys key key_len ref rows Extra
> +1 SIMPLE t1 range f2 f2 9 NULL 2 Using where; Using index for group-by
> +SELECT MAX(f1), f2 FROM t1 WHERE f2 LIKE 'c%' AND f1 <> 9 GROUP BY f2;
> +MAX(f1) f2
> +4 c
> +DROP TABLE t1;
> diff --git a/mysql-test/r/range_mrr_icp.result b/mysql-test/r/range_mrr_icp.result
> index 3f5de5b0189..1050bfcd887 100644
> --- a/mysql-test/r/range_mrr_icp.result
> +++ b/mysql-test/r/range_mrr_icp.result
> @@ -2146,4 +2146,18 @@ value1 1000685 12345
> value1 1003560 12345
> value1 1004807 12345
> drop table t1;
> +#
> +# MDEV-6707: Wrong result (extra row) with group by, multi-part key
> +#
> +CREATE TABLE t1 (f1 INT, f2 VARCHAR(1), KEY(f2,f1)) ENGINE=InnoDB;
> +INSERT INTO t1 VALUES
> +(7,'v'),(0,'s'),(9,'l'),(4,'c');
> +explain
> +SELECT MAX(f1), f2 FROM t1 WHERE f2 LIKE 'c%' AND f1 <> 9 GROUP BY f2;
> +id select_type table type possible_keys key key_len ref rows Extra
> +1 SIMPLE t1 range f2 f2 9 NULL 2 Using where; Using index for group-by
> +SELECT MAX(f1), f2 FROM t1 WHERE f2 LIKE 'c%' AND f1 <> 9 GROUP BY f2;
> +MAX(f1) f2
> +4 c
> +DROP TABLE t1;
> set optimizer_switch=@mrr_icp_extra_tmp;
> diff --git a/mysql-test/t/range.test b/mysql-test/t/range.test
> index 393ca68e945..62e907b7b4a 100644
> --- a/mysql-test/t/range.test
> +++ b/mysql-test/t/range.test
> @@ -1718,3 +1718,16 @@ where (key1varchar='value1' AND (key2int <=1 OR key2int > 1));
> --echo # The following must show col1=12345 for all rows:
> select * from t1;
> drop table t1;
> +
> +--echo #
> +--echo # MDEV-6707: Wrong result (extra row) with group by, multi-part key
> +--echo #
> +
> +CREATE TABLE t1 (f1 INT, f2 VARCHAR(1), KEY(f2,f1)) ENGINE=InnoDB;
> +INSERT INTO t1 VALUES
> +(7,'v'),(0,'s'),(9,'l'),(4,'c');
> +
> +explain
> +SELECT MAX(f1), f2 FROM t1 WHERE f2 LIKE 'c%' AND f1 <> 9 GROUP BY f2;
> +SELECT MAX(f1), f2 FROM t1 WHERE f2 LIKE 'c%' AND f1 <> 9 GROUP BY f2;
> +DROP TABLE t1;
> diff --git a/sql/opt_range.cc b/sql/opt_range.cc
> index 25a9e729a8b..e985aba1140 100644
> --- a/sql/opt_range.cc
> +++ b/sql/opt_range.cc
> @@ -11249,6 +11249,7 @@ int QUICK_RANGE_SELECT::get_next()
> @param prefix_length length of cur_prefix
> @param group_key_parts The number of key parts in the group prefix
> @param cur_prefix prefix of a key to be searched for
> + @save_last_range Saving the last range we encountered
You've meant "@param save_last_range" ?
Also please use " INOUT " to indicate this parameter is used to pass info both
from the caller and to the caller.
>
> Each subsequent call to the method retrieves the first record that has a
> prefix with length prefix_length and which is different from cur_prefix,
> @@ -11271,7 +11272,8 @@ int QUICK_RANGE_SELECT::get_next()
>
> int QUICK_RANGE_SELECT::get_next_prefix(uint prefix_length,
> uint group_key_parts,
> - uchar *cur_prefix)
> + uchar *cur_prefix,
> + QUICK_RANGE **save_last_range)
> {
> DBUG_ENTER("QUICK_RANGE_SELECT::get_next_prefix");
> const key_part_map keypart_map= make_prev_keypart_map(group_key_parts);
> @@ -11301,8 +11303,19 @@ int QUICK_RANGE_SELECT::get_next_prefix(uint prefix_length,
> last_range= 0;
> DBUG_RETURN(HA_ERR_END_OF_FILE);
> }
> +
> last_range= *(cur_range++);
>
> + if (*save_last_range)
> + {
> + if (!key_tuple_cmp(key_part_info, (*save_last_range)->min_key,
> + last_range->min_key, prefix_length))
> + {
> + last_range=NULL;
> + continue;
> + }
> + }
> + *save_last_range= last_range;
> key_range start_key, end_key;
> last_range->make_min_endpoint(&start_key, prefix_length, keypart_map);
> last_range->make_max_endpoint(&end_key, prefix_length, keypart_map);
> @@ -13315,7 +13328,7 @@ QUICK_GROUP_MIN_MAX_SELECT(TABLE *table, JOIN *join_arg, bool have_min_arg,
> seen_first_key(FALSE), doing_key_read(FALSE), min_max_arg_part(min_max_arg_part_arg),
> key_infix(key_infix_arg), key_infix_len(key_infix_len_arg),
> min_functions_it(NULL), max_functions_it(NULL),
> - is_index_scan(is_index_scan_arg)
> + is_index_scan(is_index_scan_arg), save_last_range(NULL)
> {
> head= table;
> index= use_index;
> @@ -13968,7 +13981,8 @@ int QUICK_GROUP_MIN_MAX_SELECT::next_prefix()
> uchar *cur_prefix= seen_first_key ? group_prefix : NULL;
> if ((result= quick_prefix_select->get_next_prefix(group_prefix_len,
> group_key_parts,
> - cur_prefix)))
> + cur_prefix,
> + &save_last_range)))
> DBUG_RETURN(result);
> seen_first_key= TRUE;
> }
> diff --git a/sql/opt_range.h b/sql/opt_range.h
> index b8b46ae5ab1..f4fb0642590 100644
> --- a/sql/opt_range.h
> +++ b/sql/opt_range.h
> @@ -455,6 +455,7 @@ class QUICK_RANGE_SELECT : public QUICK_SELECT_I
> QUICK_RANGE **cur_range; /* current element in ranges */
>
> QUICK_RANGE *last_range;
> + QUICK_RANGE *save_last_range;
So, now both QUICK_RANGE_SELECT and QUICK_GROUP_MIN_MAX_SELECT have
'save_last_range'. Does QUICK_RANGE_SELECT need one?
>
> KEY_PART *key_parts;
> KEY_PART_INFO *key_part_info;
> @@ -477,7 +478,7 @@ class QUICK_RANGE_SELECT : public QUICK_SELECT_I
> int get_next();
> void range_end();
> int get_next_prefix(uint prefix_length, uint group_key_parts,
> - uchar *cur_prefix);
> + uchar *cur_prefix, QUICK_RANGE **save_last_range);
> bool reverse_sorted() { return 0; }
> bool unique_key_range();
> int init_ror_merged_scan(bool reuse_handler, MEM_ROOT *alloc);
> @@ -916,7 +917,8 @@ class QUICK_GROUP_MIN_MAX_SELECT : public QUICK_SELECT_I
> Use index scan to get the next different key instead of jumping into it
> through index read
> */
> - bool is_index_scan;
> + bool is_index_scan;
> + QUICK_RANGE *save_last_range;
> public:
> /*
> The following two members are public to allow easy access from
BR
Sergei
--
Sergei Petrunia, Software Developer
MariaDB Corporation | Skype: sergefp | Blog: http://s.petrunia.net/blog