maria-developers team mailing list archive
-
maria-developers team
-
Mailing list archive
-
Message #04840
Cost analysis: FirstMatch plan+scalar subquery cost, prediction
On Wed, Aug 08, 2012 at 07:04:39PM +0400, Sergei Petrunia wrote:
> Please find below cost analysis for the plan with FirstMatch strategy.
>
> Debugging with optimizer_switch='semijoin=off', query:
>
> explain extended
> select sql_calc_found_rows
> s_name, s_address
> from nation straight_join supplier
> where s_suppkey in (select ps_suppkey from partsupp
> where ps_partkey in (select p_partkey from part
> where p_name like 'forest%')
> and ps_availqty >
> (select 0.5 * sum(l_quantity)
> from lineitem
> where l_partkey = ps_partkey
> and l_suppkey = ps_suppkey
> and l_shipdate >= date('1994-01-01')
> and l_shipdate < date('1994-01-01') +
> interval '1' year ))
> and s_nationkey = n_nationkey
> and n_name = 'CANADA'
> order by s_name
> limit 10;
>
> +----+------------------+--------+------++-------------------+-------+---------------------------------------+----+--------------------------------------------+
> |id |select_type |table |type ||key |key_len|ef |rows|Extra |
> +----+------------------+--------+------++-------------------+-------+---------------------------------------+----+--------------------------------------------+
> | 1|PRIMARY |nation |ALL ||NULL |NULL |NULL | 25|Using where; Using temporary; Using filesort|
> | 1|PRIMARY |supplier|ref ||i_s_nationkey |5 |nation.n_nationkey | 180| |
> | 1|PRIMARY |partsupp|ref ||i_ps_suppkey |4 |supplier.s_suppkey | 39|Using where |
> | 1|PRIMARY |part |eq_ref||PRIMARY |4 |partsupp.ps_partkey | 1|Using where; FirstMatch(supplier) |
> | 4|DEPENDENT SUBQUERY|lineitem|ref ||i_l_suppkey_partkey|10 |partsupp.ps_partkey,partsupp.ps_suppkey| 4|Using where |
> +----+------------------+--------+------++-------------------+-------+---------------------------------------+----+--------------------------------------------+
Cost of scalar subquery execution (in choose_plan()):
(gdb) p join->best_read
$144 = 4.5989999999999993
From the email "check if optimizer is able to produce a correct
E(#scalar_subquery executions)", we know:
>>Let's take estimate numbers:
>>1 * 251 * 34 * 1 * 0.0118 = 100.7012 evaluations.
which gives: $OVERALL_SCALAR_SUBQUERY_COST= 100.7 * 4.6= 463.2
> ===== 1. nation ===
> (gdb) p position->records_read
> $367 = 25
> (gdb) p position->read_time
> $368 = 1
> (gdb) p current_record_count / TIME_FOR_COMPARE
> $369 = 5
> (gdb) p current_read_time
> $370 = 6
>
> ===== 2. supplier ===
>
> (gdb) p position->records_read
> $377 = 180
> (gdb) p position->read_time
> $378 = 4525
> (gdb) p current_record_count / TIME_FOR_COMPARE
> $379 = 900
> (gdb) p current_read_time
> $380 = 5431
>
> ## here it considers "part, partsupp" extension, with cost
> ## 12,984,927. I skip that, because that's not the optimal plan
>
> ===== 3. partsupp ===
>
> (gdb) p position->records_read
> $411 = 39
> (gdb) p position->read_time
> $412 = 180000
> (gdb) p current_record_count / TIME_FOR_COMPARE
> $413 = 35100
> (gdb) p current_read_time
> $414 = 220531
>
> ===== 4. part ===
> (gdb) p position->records_read
> $421 = 1
> (gdb) p position->read_time
> $422 = 175500
> (gdb) p current_record_count / TIME_FOR_COMPARE
> $423 = 35100
> (gdb) p current_read_time
> $424 = 431131
>
> > advance_sj_state()
> > Firstmatch_picker::check_qep
> > optimize_wo_join_buffering
>
> ### It doesn't make any best_access_path() calls here, because no table is
> ### using join buffering.
>
> ### It forgets to add (current_record_count / TIME_FOR_COMPARE), and ends
> ### up with a slightly lower cost of
> (gdb) print cost
> $425 = 360931
> > optimize_wo_join_buffering
> > Firstmatch_picker::check_qep
>
> ## FirstMatch is chosen as the way we will use to eliminate duplicates..
>
> ## At the end of advance_sj_state, we have:
>
> (gdb) p *current_read_time
> $426 = 360931
> (gdb) p *current_record_count
> $427 = 4500
> > advance_sj_state().
Adding $OVERALL_SCALAR_SUBQUERY_COST:
360931 + $OVERALL_SCALAR_SUBQUERY_COST = 361394.20.
which is still much greater than the cost of Materialization+$SCALAR_SUBQ_COST,
which is 24101.8 (see prev. email).
We've got to think what else is wrong here.
BR
Sergei
--
Sergei Petrunia, Software Developer
Monty Program AB, http://askmonty.org
Blog: http://s.petrunia.net/blog
References