maria-developers team mailing list archive
-
maria-developers team
-
Mailing list archive
-
Message #04714
Re: Review request for: [Commits] Rev 3403: Fix for bug lp:944706, task MDEV-193 in file:///home/tsk/mprog/src/5.5-lpb944706/
On Mon, May 14, 2012 at 11:22:52PM +0300, Timour Katchaounov wrote:
> > > /**
> > > + Estimate the number of rows that query execution will read.
> > > +
> > > + @todo This is a very pessimistic upper bound. Use join selectivity
> > > + when available to produce a more realistic number.
> > > +*/
> > > +
> > > +double JOIN::get_examined_rows()
> > > +{
> > > + /* Each constant table examines one row, and the result is at most one row. */
> > > + ha_rows examined_rows= const_tables;
> > > + uint i= const_tables;
> > > + double prev_fanout;
> > > +
> > > + if (table_count == const_tables)
> > > + return examined_rows;
> > > +
> > > + examined_rows+= join_tab[i++].get_examined_rows();
> > > + for (; i < table_count ; i++)
> > > + {
> > > + if (join_tab[i].type == JT_EQ_REF)
> > > + prev_fanout= 1;
> > > + else
> > > + prev_fanout= best_positions[i-1].records_read;
>
> This looks wrong. Declaration of POSITION::records_read has this comment:
>
> /*
> The "fanout": number of output rows that will be produced (after
> pushed down selection condition is applied) per each row combination of
> previous tables.
> */
>
> note the "PER EACH ROW COMBINATION .." part. I would expect that this function
> would calculate a product of records_read values.
>
> timour:
>
> In this function we want to estimate how many rows will be *examined*, not
> produced by each JOIN operator. For the estimate I assume a simple nested loops
> model, where the JOIN read every row of its right table as many times as many
> rows there are in the left operand. The partial join that serves as left
> operand, contains records_read rows. This is the multiplier of the number of
> rows that will be examined in the right table.
>
> Of course, for each partial join the join condition will filter a subset of
> these rows.
>
> I agree that blocking algorithms may examine a lot less rows, but it doesn't
> make sense to have a very tight bound here. We use an upper bound, because it
> is better to miss some constant optimizations, rather than execute very
> expensive subqueries here.
I was talking about something much more simpler than that. Consider this
example:
create table ten (a int);
insert into ten values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);
create table one_k (a int);
insert into one_k select A.a + 10*B.a + 100*C.a from ten A, ten B, ten C;
create table ti1 ( a int);
insert into ti1 select a from ten limit 4;
alter table ti1 add b int;
create table ti2 (a int primary key, b int);
create table ti3 (a int primary key, b int);
insert into ti2 select a, a from one_k;
insert into ti3 select a, a from one_k;
MariaDB [test]> explain select * from ten where 3 in (select ti2.b + ti3.b from ti1, ti2, ti3 where ti2.a=ti1.a and ti3.a=ti1.a) or a <5;
+------+--------------------+-------+--------+---------------+---------+---------+------------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+--------------------+-------+--------+---------------+---------+---------+------------+------+-------------+
| 1 | PRIMARY | ten | ALL | NULL | NULL | NULL | NULL | 10 | Using where |
| 2 | DEPENDENT SUBQUERY | ti1 | ALL | NULL | NULL | NULL | NULL | 4 | Using where |
| 2 | DEPENDENT SUBQUERY | ti3 | eq_ref | PRIMARY | PRIMARY | 4 | test.ti1.a | 1 | |
| 2 | DEPENDENT SUBQUERY | ti2 | eq_ref | PRIMARY | PRIMARY | 4 | test.ti1.a | 1 | Using where |
+------+--------------------+-------+--------+---------------+---------+---------+------------+------+-------------+
For this example, JOIN::get_examined_rows() produces 6. The number comes from
4 rows examined in table ti1 (correct)
1 row examined in table ti2 (incorrect, should be 4)
1 row examined in table ti3 (incorrect, should be 4)
Do you agree that the number of 6 is incorrect and needs to be fixed?
BR
Sergei
--
Sergei Petrunia, Software Developer
Monty Program AB, http://askmonty.org
Blog: http://s.petrunia.net/blog
References