maria-developers team mailing list archive

Thread
Date
Cost analysis: FirstMatch plan

To: timour@xxxxxxxxxxxx
From: Sergei Petrunia <psergey@xxxxxxxxxxxx>
Date: Wed, 8 Aug 2012 19:04:39 +0400
Cc: maria-developers@xxxxxxxxxxxxxxxxxxx
User-agent: Mutt/1.5.18 (2008-05-17)
Hi Timour, 

Please find below cost analysis for the plan with FirstMatch strategy.

Debugging with optimizer_switch='semijoin=off', query:

explain extended
select sql_calc_found_rows
       s_name, s_address
from nation straight_join supplier
where s_suppkey in (select ps_suppkey from partsupp
                    where ps_partkey in (select p_partkey from part
                                         where p_name like 'forest%')
                          and ps_availqty >
                              (select 0.5 * sum(l_quantity)
                               from lineitem
                               where l_partkey = ps_partkey
                                     and l_suppkey = ps_suppkey
                                     and l_shipdate >= date('1994-01-01')
                                     and l_shipdate < date('1994-01-01') +
                                         interval '1' year ))
and s_nationkey = n_nationkey
and n_name = 'CANADA'
order by s_name
limit 10;

+----+------------------+--------+------++-------------------+-------+---------------------------------------+----+--------------------------------------------+
|id  |select_type       |table   |type  ||key                |key_len|ef                                     |rows|Extra                                       |
+----+------------------+--------+------++-------------------+-------+---------------------------------------+----+--------------------------------------------+
|   1|PRIMARY           |nation  |ALL   ||NULL               |NULL   |NULL                                   |  25|Using where; Using temporary; Using filesort|
|   1|PRIMARY           |supplier|ref   ||i_s_nationkey      |5      |nation.n_nationkey                     | 180|                                            |
|   1|PRIMARY           |partsupp|ref   ||i_ps_suppkey       |4      |supplier.s_suppkey                     |  39|Using where                                 |
|   1|PRIMARY           |part    |eq_ref||PRIMARY            |4      |partsupp.ps_partkey                    |   1|Using where; FirstMatch(supplier)           |
|   4|DEPENDENT SUBQUERY|lineitem|ref   ||i_l_suppkey_partkey|10     |partsupp.ps_partkey,partsupp.ps_suppkey|   4|Using where                                 |
+----+------------------+--------+------++-------------------+-------+---------------------------------------+----+--------------------------------------------+

===== 1. nation === 
(gdb) p position->records_read 
  $367 = 25
(gdb) p position->read_time
  $368 = 1
(gdb) p current_record_count / TIME_FOR_COMPARE 
  $369 = 5
(gdb) p current_read_time
  $370 = 6

===== 2. supplier === 

(gdb) p position->records_read
  $377 = 180
(gdb) p position->read_time
  $378 = 4525
(gdb) p current_record_count / TIME_FOR_COMPARE
  $379 = 900
(gdb) p current_read_time
  $380 = 5431

## here it considers "part, partsupp" extension, with cost 
## 12,984,927. I skip that, because that's not the optimal plan

===== 3. partsupp === 

(gdb) p position->records_read
  $411 = 39
(gdb) p position->read_time
  $412 = 180000
(gdb) p current_record_count / TIME_FOR_COMPARE
  $413 = 35100
(gdb) p current_read_time
  $414 = 220531
 
===== 4. part === 
(gdb) p position->records_read
  $421 = 1
(gdb) p position->read_time
  $422 = 175500
(gdb) p current_record_count / TIME_FOR_COMPARE
  $423 = 35100
(gdb) p current_read_time
  $424 = 431131

> advance_sj_state()
  > Firstmatch_picker::check_qep
    > optimize_wo_join_buffering

     ### It doesn't make any best_access_path() calls here, because no table is
     ###  using join buffering.

     ### It forgets to add (current_record_count / TIME_FOR_COMPARE), and ends
     ###   up with a slightly lower cost of
     (gdb) print cost
     $425 = 360931
    > optimize_wo_join_buffering
  > Firstmatch_picker::check_qep

  ## FirstMatch is chosen as the way we will use to eliminate duplicates..
  
  ## At the end of advance_sj_state, we have:

  (gdb) p *current_read_time
    $426 = 360931
  (gdb) p *current_record_count
    $427 = 4500
> advance_sj_state().


And this is what is chosen as the final query plan.

-- 
BR
 Sergei
-- 
Sergei Petrunia, Software Developer
Monty Program AB, http://askmonty.org
Blog: http://s.petrunia.net/blog
Follow ups

Cost analysis: FirstMatch plan+scalar subquery cost, prediction
From: Sergei Petrunia, 2012-08-09
Re: Cost analysis: FirstMatch plan
From: Sergei Petrunia, 2012-08-08