← Back to team overview

maria-developers team mailing list archive

Cost analysis: checking if join optimization costs match the reality.

 

Hi!

I've ran this experiment: I took the scalar subquery out of the query and then
tried to compare optimizer costs with execution times. 

== The query ==
explain extended
select sql_calc_found_rows
       s_name, s_address
from nation straight_join supplier
where s_suppkey in (select ps_suppkey from partsupp
                    where ps_partkey in (select p_partkey from part
                                         where p_name like 'forest%')
                   )
and s_nationkey = n_nationkey
and n_name = 'CANADA'
order by s_name
limit 10;
+--+-----------+--------+------++-------------+-------+-------------------+----+---------------------------------------------------------+
|id|select_type|table   |type  ||key          |key_len|ref                |rows|Extra                                                    |
+--+-----------+--------+------++-------------+-------+-------------------+----+---------------------------------------------------------+
| 1|PRIMARY    |nation  |ref   ||n_name       |26     |const              |   1|Using where; Using index; Using temporary; Using filesort|
| 1|PRIMARY    |supplier|ref   ||i_s_nationkey|5      |nation.n_nationkey | 207|                                                         |
| 1|PRIMARY    |partsupp|ref   ||i_ps_suppkey |4      |supplier.s_suppkey |  50|Using index                                              |
| 1|PRIMARY    |part    |eq_ref||PRIMARY      |4      |partsupp.ps_partkey|   1|Using where; FirstMatch(supplier)                        |
+--+-----------+--------+------++-------------+-------+-------------------+----+---------------------------------------------------------+

+--+------------+-----------+------++-------------+-------+------------------+----+---------------------------------------------------------+
|id|select_type |table      |type  ||key          |key_len|ref               |rows|Extra                                                    |
+--+------------+-----------+------++-------------+-------+------------------+----+---------------------------------------------------------+
| 1|PRIMARY     |nation     |ref   ||n_name       |26     |const             |   1|Using where; Using index; Using temporary; Using filesort|
| 1|PRIMARY     |supplier   |ref   ||i_s_nationkey|5      |nation.n_nationkey| 207|                                                         |
| 1|PRIMARY     |<subquery2>|eq_ref||distinct_key |4      |func              |   1|                                                         |
| 2|MATERIALIZED|part       |range ||p_name       |58     |NULL              |2387|Using where; Using index                                 |
| 2|MATERIALIZED|partsupp   |ref   ||i_ps_partkey |4      |part.p_partkey    |   2|Using index                                              |
+--+------------+-----------+------++-------------+-------+------------------+----+---------------------------------------------------------+


=== Execution times ===
FirstMatch: 0.20 - 0.30 sec.
Materialization: 0.20 sec  - 0.30 sec

run times for both fluctuate between 0.20 and 0.30 sec, FirstMatch seems to be
somewhat slower (dunno if the difference is statistically meaningful)

== Optimizer costs ===
FirstMatch:      11029.4
Materialization:  3594.5
 
=== Read record counters ===
(source data is plan counters provided below)
FirstMatch:  43185
Materialization: 12354  + 9K temptable writes + 400 reads

=== Conclusions ===
- optimizer thinks FirstMatch is 3x more expensive.
- In reality it is *somewhat* expensive.
- Record counters say FirstMatch is 3.5x .. 1.9x more expensive, depending on
  how we treat temptable writes/reads

==> Overall, I see no big discrepancies.


==> Also, I have also discovered that JOIN::get_prefix_cost_and_fanout() 
forgets to add record_count/TIME_FOR_COMPARE, which causes (slightly) 
different cost for SJ-Materialization.  (I have noted the difference 
in some previous email)

=== Firstmatch plan counters ===
MariaDB [dbt3sf1]> show table_statistics;
+--------------+------------+-----------+--------------+-------------------------+
| Table_schema | Table_name | Rows_read | Rows_changed | Rows_changed_x_#indexes |
+--------------+------------+-----------+--------------+-------------------------+
| dbt3sf1      | nation     |         1 |            0 |                       0 |
| dbt3sf1      | part       |     21386 |            0 |                       0 |
| dbt3sf1      | partsupp   |     21386 |            0 |                       0 |
| dbt3sf1      | supplier   |       412 |            0 |                       0 |
+--------------+------------+-----------+--------------+-------------------------+
4 rows in set (0.00 sec)

MariaDB [dbt3sf1]> show index_statistics;
+--------------+------------+---------------+-----------+
| Table_schema | Table_name | Index_name    | Rows_read |
+--------------+------------+---------------+-----------+
| dbt3sf1      | nation     | n_name        |         1 |
| dbt3sf1      | part       | PRIMARY       |     21386 |
| dbt3sf1      | supplier   | i_s_nationkey |       412 |
| dbt3sf1      | partsupp   | i_ps_suppkey  |     21386 |
+--------------+------------+---------------+-----------+

Total= 43185 rows read

=== Materialization plan counters ===

MariaDB [dbt3sf1]> show index_statistics;
+--------------+------------+---------------+-----------+
| Table_schema | Table_name | Index_name    | Rows_read |
+--------------+------------+---------------+-----------+
| dbt3sf1      | nation     | n_name        |         1 |
| dbt3sf1      | part       | p_name        |      2389 |
| dbt3sf1      | supplier   | i_s_nationkey |       412 |
| dbt3sf1      | partsupp   | i_ps_partkey  |      9552 |
+--------------+------------+---------------+-----------+
4 rows in set (0.00 sec)

MariaDB [dbt3sf1]> show table_statistics;
+--------------+------------+-----------+--------------+-------------------------+
| Table_schema | Table_name | Rows_read | Rows_changed | Rows_changed_x_#indexes |
+--------------+------------+-----------+--------------+-------------------------+
| dbt3sf1      | nation     |         1 |            0 |                       0 |
| dbt3sf1      | part       |      2389 |            0 |                       0 |
| dbt3sf1      | partsupp   |      9552 |            0 |                       0 |
| dbt3sf1      | supplier   |       412 |            0 |                       0 |
+--------------+------------+-----------+--------------+-------------------------+
4 rows in set (0.00 sec)

Total = 12354 rows read
 + 9K temptable writes + 400 reads from temptable.

BR
 Sergei
-- 
Sergei Petrunia, Software Developer
Monty Program AB, http://askmonty.org
Blog: http://s.petrunia.net/blog


Follow ups