maria-developers team mailing list archive

Thread
Date
Updated (by Psergey): subquery optimizations: Use cache for correlated non-semijoin subqueries (18)

To: sergey@xxxxxxxxxxxx
From: worklog-noreply@xxxxxxxxxxxx
Date: Mon, 11 Jan 2010 11:22:35 +0000 (UTC)
Delivered-to: worklog-db@xxxxxxxxxxxx
-----------------------------------------------------------------------
                              WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: subquery optimizations: Use cache for correlated non-semijoin
		subqueries
CREATION DATE..: Mon, 11 May 2009, 19:03
SUPERVISOR.....: Monty
IMPLEMENTOR....: 
COPIES TO......: 
CATEGORY.......: Server-RawIdeaBin
TASK ID........: 18 (http://askmonty.org/worklog/?tid=18)
VERSION........: Server-9.x
STATUS.........: Cancelled
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0

PROGRESS NOTES:

-=-=(Psergey - Mon, 11 Jan 2010, 13:22)=-=-
High Level Description modified.
--- /tmp/wklog.18.old.24759     2010-01-11 11:22:35.000000000 +0000
+++ /tmp/wklog.18.new.24759     2010-01-11 11:22:35.000000000 +0000
@@ -1,3 +1,5 @@
+(( This WL entry is superseded by MWL#66 ))
+
 Suppose there is a subquery that is correlated, and is not in the WHERE clause
 e.g. 
 

-=-=(Guest - Mon, 11 Jan 2010, 13:21)=-=-
Status updated.
--- /tmp/wklog.18.old.24677     2010-01-11 13:21:24.000000000 +0200
+++ /tmp/wklog.18.new.24677     2010-01-11 13:21:24.000000000 +0200
@@ -1 +1 @@
-Un-Assigned
+Cancelled

-=-=(Psergey - Mon, 11 May 2009, 19:05)=-=-
High-Level Specification modified.
--- /tmp/wklog.18.old.1385      2009-05-11 19:05:55.000000000 +0300
+++ /tmp/wklog.18.new.1385      2009-05-11 19:05:55.000000000 +0300
@@ -1 +1,12 @@
+Not a spec but some considerations:
 
+* eq_ref access has one-element "lookup cache"
+
+* Materialization optimization (WL#1110) also has lookup cache, see
+subselect_hash_semi_join_engine::test_if_l_operand_changed.
+
+The tricky parts are:
+
+1. Assemble a list of correlation references
+
+2. Determine cache size.



DESCRIPTION:

(( This WL entry is superseded by MWL#66 ))

Suppose there is a subquery that is correlated, and is not in the WHERE clause
e.g. 

SELECT ..., (SELECT ... FROM t2 WHERE t2.col=t1.col) FROM t1; 

or 

SELECT ..., left_expr IN (SELECT ... FROM t2 WHERE t2.col=t1.col) FROM t1; 

In this case, the only strategy is to re-evaluate the predicate for every record
combination of the outer select.  This can be improved if we catch all
correlation references and implement a lookup cache:

  (left_expr, outer_ref_value1, ... outer_ref_valueN) -> predicate_value

it's hard to predict what will be the optimal size of the cache but one-element
cache will most certainly be worth it.


HIGH-LEVEL SPECIFICATION:



Not a spec but some considerations:

* eq_ref access has one-element "lookup cache"

* Materialization optimization (WL#1110) also has lookup cache, see
subselect_hash_semi_join_engine::test_if_l_operand_changed.

The tricky parts are:

1. Assemble a list of correlation references

2. Determine cache size.


ESTIMATED WORK TIME

ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)