← Back to team overview

maria-developers team mailing list archive

Updated (by Guest): Make EXPLAIN always show materialization separately (110)

 

-----------------------------------------------------------------------
                              WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Make EXPLAIN always show materialization separately
CREATION DATE..: Mon, 29 Mar 2010, 06:45
SUPERVISOR.....: Igor
IMPLEMENTOR....: Psergey
COPIES TO......: 
CATEGORY.......: Server-RawIdeaBin
TASK ID........: 110 (http://askmonty.org/worklog/?tid=110)
VERSION........: Server-5.3
STATUS.........: Complete
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0

PROGRESS NOTES:

-=-=(Guest - Mon, 29 Mar 2010, 14:09)=-=-
Status updated.
--- /tmp/wklog.110.old.11193    2010-03-29 14:09:27.000000000 +0000
+++ /tmp/wklog.110.new.11193    2010-03-29 14:09:27.000000000 +0000
@@ -1 +1 @@
-Un-Assigned
+Complete

-=-=(Psergey - Mon, 29 Mar 2010, 06:46)=-=-
Low Level Design modified.
--- /tmp/wklog.110.old.11745    2010-03-29 06:46:30.000000000 +0000
+++ /tmp/wklog.110.new.11745    2010-03-29 06:46:30.000000000 +0000
@@ -1 +1,8 @@
+For now, all changes will be in select_describe():
+- In the for-each-join-table loop, when we've reached a line where we would
+  have printed "[Start ]Materialize;" in Extra column, remember the table 
+  number, and emit a materialized table access line instead
+- After the loop, do another loop over remembered materialization nests and
+  print them (a possible difficulty: do we remember what select# they are
+  from?)
 

-=-=(Psergey - Mon, 29 Mar 2010, 06:46)=-=-
High-Level Specification modified.
--- /tmp/wklog.110.old.11654    2010-03-29 06:46:19.000000000 +0000
+++ /tmp/wklog.110.new.11654    2010-03-29 06:46:19.000000000 +0000
@@ -1 +1,15 @@
+Materialized table access line will look as follows:
 
+Table name
+----------
+* Table name will be "SUBQUERY#%d" where %d will refer to the id of first
+  select in the subquery (when the subquery is a UNION it would be better
+  to refe to the union-operation line but it has id=NULL so it's not easy
+  to refer to it)
+
+Access method
+-------------
+* SJ-Materialization-lookup will have eq_ref on 'distinct_key'
+* SJ-Materialization-scan  will have access method ALL, with #rows being
+  expected number of records in the temp table (i.e. after duplicates are
+  removed)

-=-=(Psergey - Mon, 29 Mar 2010, 06:46)=-=-
Category updated.
--- /tmp/wklog.110.old.11639    2010-03-29 06:46:02.000000000 +0000
+++ /tmp/wklog.110.new.11639    2010-03-29 06:46:02.000000000 +0000
@@ -1 +1 @@
-Client-BackLog
+Server-RawIdeaBin

-=-=(Psergey - Mon, 29 Mar 2010, 06:46)=-=-
Version updated.
--- /tmp/wklog.110.old.11639    2010-03-29 06:46:02.000000000 +0000
+++ /tmp/wklog.110.new.11639    2010-03-29 06:46:02.000000000 +0000
@@ -1 +1 @@
-Benchmarks-3.0
+9.x

-=-=(Psergey - Mon, 29 Mar 2010, 06:46)=-=-
Version updated.
--- /tmp/wklog.110.old.11639    2010-03-29 06:46:02.000000000 +0000
+++ /tmp/wklog.110.new.11639    2010-03-29 06:46:02.000000000 +0000
@@ -1 +1 @@
-9.x
+Server-5.3



DESCRIPTION:

At the moment, SJM-Materialization is shown in EXPLAIN output in this way:

MariaDB [j45]> explain select * from ot where a in (select b from it1);
+----+-------------+-------+------+---------------+------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows
| Extra       |
+----+-------------+-------+------+---------------+------+---------+------+------+-------------+
|  1 | PRIMARY     | ot    | ALL  | NULL          | NULL | NULL    | NULL |   10
|             |
|  1 | PRIMARY     | it1   | ALL  | NULL          | NULL | NULL    | NULL |   10
| Materialize |
+----+-------------+-------+------+---------------+------+---------+------+------+-------------+

MariaDB [j45]> explain select * from ot where a in (select it1.b from it1, it2);
+----+-------------+-------+------+---------------+------+---------+------+------+------------------------------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows
| Extra                              |
+----+-------------+-------+------+---------------+------+---------+------+------+------------------------------------+
|  1 | PRIMARY     | ot    | ALL  | NULL          | NULL | NULL    | NULL |   10
|                                    |
|  1 | PRIMARY     | it1   | ALL  | NULL          | NULL | NULL    | NULL |   10
| Start materialize                  |
|  1 | PRIMARY     | it2   | ALL  | NULL          | NULL | NULL    | NULL |   10
| End materialize; Using join buffer |
+----+-------------+-------+------+---------------+------+---------+------+------+------------------------------------+

This WL task is to change the output format so it will look as follows:
- Tables inside the SJM-nest are displayed as a separate select
- within the master select, there is a line that denotes SJM-lookup or
  SJM-Scan.

The above examples will look as follows:

MariaDB [j45]> explain select * from ot where a in (select b from it1);
+----+-------------+------------+--------+---------------+--------------+---------+----------+------+-------+
| id | select_type | table      | type   | possible_keys | key          |
key_len | ref      | rows | Extra |
+----+-------------+------------+--------+---------------+--------------+---------+----------+------+-------+
|  1 | PRIMARY     | ot         | ALL    | NULL          | NULL         | NULL 
  | NULL     |   10 |       |
|  1 | PRIMARY     | SUBQUERY#2 | eq_ref | distinct_key  | distinct_key | 5    
  | j45.ot.a |    1 |       |
|  2 | SUBQUERY    | it1        | ALL    | NULL          | NULL         | NULL 
  | NULL     |   10 |       |
+----+-------------+------------+--------+---------------+--------------+---------+----------+------+-------+

MariaDB [j45]> explain select * from ot where a in (select it1.b from it1, it2);
+----+-------------+------------+--------+---------------+--------------+---------+----------+------+-------------------+
| id | select_type | table      | type   | possible_keys | key          |
key_len | ref      | rows | Extra             |
+----+-------------+------------+--------+---------------+--------------+---------+----------+------+-------------------+
|  1 | PRIMARY     | ot         | ALL    | NULL          | NULL         | NULL 
  | NULL     |   10 |                   |
|  1 | PRIMARY     | SUBQUERY#2 | eq_ref | distinct_key  | distinct_key | 5    
  | j45.ot.a |    1 |                   |
|  2 | SUBQUERY    | it1        | ALL    | NULL          | NULL         | NULL 
  | NULL     |   10 |                   |
|  2 | SUBQUERY    | it2        | ALL    | NULL          | NULL         | NULL 
  | NULL     |   10 | Using join buffer |
+----+-------------+------------+--------+---------------+--------------+---------+----------+------+-------------------+

The rationale behind the change is:
- Unification of EXPLAIN output with MWL#90
- The new format is more natural representation of what is going on,
  conceptually-wise (and may be soon be code-wise)
- The new format allows to display E(#records-in-temp-table) for the 
  SJM-Scan case (and for SJM-lookup that number doesn't matter that much)
- The new format doesn't put anything into "Extra" column and that's good 
  because that column is already overloaded and horizontal screen space is
  precious (while vertical is not so much).


HIGH-LEVEL SPECIFICATION:



Materialized table access line will look as follows:

Table name
----------
* Table name will be "SUBQUERY#%d" where %d will refer to the id of first
  select in the subquery (when the subquery is a UNION it would be better
  to refe to the union-operation line but it has id=NULL so it's not easy
  to refer to it)

Access method
-------------
* SJ-Materialization-lookup will have eq_ref on 'distinct_key'
* SJ-Materialization-scan  will have access method ALL, with #rows being
  expected number of records in the temp table (i.e. after duplicates are
  removed)


LOW-LEVEL DESIGN:



For now, all changes will be in select_describe():
- In the for-each-join-table loop, when we've reached a line where we would
  have printed "[Start ]Materialize;" in Extra column, remember the table 
  number, and emit a materialized table access line instead
- After the loop, do another loop over remembered materialization nests and
  print them (a possible difficulty: do we remember what select# they are
  from?)


ESTIMATED WORK TIME

ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)