← Back to team overview

maria-developers team mailing list archive

Updated (by Psergey): Subqueries: Inside-out execution for non-semijoin materialized subqueries that are AND-parts of the WHERE (90)

 

-----------------------------------------------------------------------
                              WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........:  Subqueries: Inside-out execution for non-semijoin materialized
		subqueries that are AND-parts of the WHERE
CREATION DATE..: Sun, 28 Feb 2010, 13:45
SUPERVISOR.....: Monty
IMPLEMENTOR....: 
COPIES TO......: Igor, Psergey, Timour
CATEGORY.......: Server-RawIdeaBin
TASK ID........: 90 (http://askmonty.org/worklog/?tid=90)
VERSION........: Server-5.3
STATUS.........: Un-Assigned
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: -1 (hours remain)
ORIG. ESTIMATE.: 0

PROGRESS NOTES:

-=-=(Psergey - Sun, 28 Feb 2010, 14:47)=-=-
High Level Description modified.
--- /tmp/wklog.90.old.21880     2010-02-28 14:47:28.000000000 +0000
+++ /tmp/wklog.90.new.21880     2010-02-28 14:47:28.000000000 +0000
@@ -1,10 +1,17 @@
-For uncorrelated IN subqueries that can't be converted to semi-joins it is 
-necessary to make a cost-based choice between IN->EXISTS and Materialization
-strategies.
+Consider the following case:
 
-Both strategies handle two cases:
-1. A simple case w/o NULLs handling
-2. Handling NULLs.
+SELECT * FROM big_table 
+WHERE oe IN (SELECT ie FROM table_with_few_groups
+             WHERE ...
+             GROUP BY group_col) AND ...
 
-This WL is about making cost-based decision for #1.
+Here the best way to execute the query is:
 
+  Materialize the subquery;
+  # now run the join:
+  for each record R1 in materialized table
+    for each record R2 in big_table such that oe=R1
+      pass R2 to output
+
+Semi-join materialization supports such strategy with SJM-Scan strategy. This WL
+entry is about adding support for such strategies for non-semijoin subqueries.

-=-=(Psergey - Sun, 28 Feb 2010, 14:47)=-=-
Title modified.
--- /tmp/wklog.90.old.21859     2010-02-28 14:47:02.000000000 +0000
+++ /tmp/wklog.90.new.21859     2010-02-28 14:47:02.000000000 +0000
@@ -1 +1 @@
-Subqueries: cost-based choice between Materialization and IN->EXISTS transformation
+ Subqueries: Inside-out execution for non-semijoin materialized subqueries that are AND-parts of the WHERE

-=-=(Psergey - Sun, 28 Feb 2010, 14:08)=-=-
Dependency created: 94 now depends on 90



DESCRIPTION:

Consider the following case:

SELECT * FROM big_table 
WHERE oe IN (SELECT ie FROM table_with_few_groups
             WHERE ...
             GROUP BY group_col) AND ...

Here the best way to execute the query is:
  
  Materialize the subquery;
  # now run the join:
  for each record R1 in materialized table
    for each record R2 in big_table such that oe=R1
      pass R2 to output

Semi-join materialization supports such strategy with SJM-Scan strategy. This WL
entry is about adding support for such strategies for non-semijoin subqueries.


ESTIMATED WORK TIME

ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)