maria-developers team mailing list archive

Thread
Date

Re: [Commits] 167a96b5157: MDEV-9959: A serious MariaDB server performance bug

To: Varun <varunraiko1803@xxxxxxxxx>
From: Sergey Petrunia <sergey@xxxxxxxxxxx>
Date: Fri, 19 Apr 2019 00:17:04 +0300
Cc: commits@xxxxxxxxxxx, maria-developers@xxxxxxxxxxxxxxxxxxx
In-reply-to: <20180327111811.05EA6101BB97B3@Varuns-MacBook-Pro.local>
User-agent: Mutt/1.9.4 (2018-02-28)

On Tue, Mar 27, 2018 at 04:48:10PM +0530, Varun wrote:
> revision-id: 167a96b5157049408a6ad4bca7abcd376af93fb5 (mariadb-10.3.0-644-g167a96b5157)
> parent(s): 4359c6b4806605c78987e50cab3a6b42016b7603
> author: Varun Gupta
> committer: Varun Gupta
> timestamp: 2018-03-27 16:45:46 +0530
> message:
> 
> MDEV-9959: A serious MariaDB server performance bug
> 
> step#1: if a derived table has SELECT DISTINCT, provide index statistics for it so that the join optimizer in the
> upper select knows that ref access to the table will produce one row.
> 
> Added handling for multiple selects in the derived table

...
> diff --git a/mysql-test/r/mdev9959.result b/mysql-test/r/mdev9959.result
> new file mode 100644
> index 00000000000..049e0350cca
> --- /dev/null
> +++ b/mysql-test/r/mdev9959.result
> @@ -0,0 +1,46 @@
> +create table t1(a int);
> +insert into t1 values (1),(2),(3),(4),(5),(6);
> +create table t2(a int, b int,c int);
> +insert into t2(a,b,c) values (1,1,2),(2,2,3),(3,1,4),(4,2,2),(5,1,1),(6,2,5);
> +create table t3(a int, b int);
> +insert into t3(a,b) values (1,1),(2,2),(2,1),(1,2),(5,1),(9,2);
> +table "<derived2>" should have type=ref and rows=1
> +one select in derived table
> +with distinct
> +analyze select * from t1 , ((select distinct t2.a from t2 order by c))q  where t1.a=q.a;
> +id	select_type	table	type	possible_keys	key	key_len	ref	rows	r_rows	filtered	r_filtered	Extra
> +1	PRIMARY	t1	ALL	NULL	NULL	NULL	NULL	6	6.00	100.00	100.00	Using where
> +1	PRIMARY	<derived2>	ref	key0	key0	5	test.t1.a	1	1.00	100.00	100.00	
> +2	DERIVED	t2	ALL	NULL	NULL	NULL	NULL	6	6.00	100.00	100.00	Using temporary; Using filesort
> +# multiple selects in derived table
> +# NO UNION ALL
> +analyze select * from t1 , ( (select t2.a,t2.b from t2 order by c) union  (select t2.a,t2.b from t2 order by c))q  where t1.a=q.a;
> +id	select_type	table	type	possible_keys	key	key_len	ref	rows	r_rows	filtered	r_filtered	Extra
> +1	PRIMARY	t1	ALL	NULL	NULL	NULL	NULL	6	6.00	100.00	100.00	Using where
> +1	PRIMARY	<derived2>	ref	key0	key0	5	test.t1.a	1	1.00	100.00	100.00	
> +2	DERIVED	t2	ALL	NULL	NULL	NULL	NULL	6	6.00	100.00	100.00	
> +3	UNION	t2	ALL	NULL	NULL	NULL	NULL	6	6.00	100.00	100.00	
> +NULL	UNION RESULT	<union2,3>	ALL	NULL	NULL	NULL	NULL	NULL	6.00	NULL	NULL	

How would you explain the above? The derived table is a union of two selects,
each of which produces 6 rows. Neither of the selects have DISTINCT attribute.

Why does the optimization fire and set ref=1 in this case?

(if both parts of UNION had distinct, this would make sense. But if neither
does, I don't see any logic)

BR
 Sergei
-- 
Sergei Petrunia, Software Developer
MariaDB Corporation | Skype: sergefp | Blog: http://s.petrunia.net/blog