
maria-developers team mailing list archive

Updated (by Guest): index_merge: fair choice between index_merge union and range access (24)

 

-----------------------------------------------------------------------
                              WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: index_merge: fair choice between index_merge union and range access
CREATION DATE..: Tue, 26 May 2009, 12:10
SUPERVISOR.....: Monty
IMPLEMENTOR....: Psergey
COPIES TO......: Psergey
CATEGORY.......: Server-RawIdeaBin
TASK ID........: 24 (http://askmonty.org/worklog/?tid=24)
VERSION........: 9.x
STATUS.........: Un-Assigned
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0

PROGRESS NOTES:

-=-=(Guest - Wed, 27 May 2009, 14:41)=-=-
Category updated.
--- /tmp/wklog.24.old.8414      2009-05-27 14:41:43.000000000 +0300
+++ /tmp/wklog.24.new.8414      2009-05-27 14:41:43.000000000 +0300
@@ -1 +1 @@
-Client-BackLog
+Server-RawIdeaBin

-=-=(Guest - Wed, 27 May 2009, 14:41)=-=-
Version updated.
--- /tmp/wklog.24.old.8414      2009-05-27 14:41:43.000000000 +0300
+++ /tmp/wklog.24.new.8414      2009-05-27 14:41:43.000000000 +0300
@@ -1 +1 @@
-Server-9.x
+9.x

-=-=(Guest - Wed, 27 May 2009, 13:59)=-=-
Title modified.
--- /tmp/wklog.24.old.9498      2009-05-27 13:59:23.000000000 +0300
+++ /tmp/wklog.24.new.9498      2009-05-27 13:59:23.000000000 +0300
@@ -1 +1 @@
-index_merge optimizer: dont discard index_merge union strategies when range is available
+index_merge: fair choice between index_merge union and range access

-=-=(Guest - Wed, 27 May 2009, 13:59)=-=-
Version updated.
--- /tmp/wklog.24.old.9498      2009-05-27 13:59:23.000000000 +0300
+++ /tmp/wklog.24.new.9498      2009-05-27 13:59:23.000000000 +0300
@@ -1 +1 @@
-Benchmarks-3.0
+Server-9.x

-=-=(Guest - Tue, 26 May 2009, 13:27)=-=-
High-Level Specification modified.
--- /tmp/wklog.24.old.305       2009-05-26 13:27:32.000000000 +0300
+++ /tmp/wklog.24.new.305       2009-05-26 13:27:32.000000000 +0300
@@ -1 +1,70 @@
+(Not a ready HLS but draft)
+<contents>
+Solution overview
+Limitations
+TODO
+
+</contents>
+
+Solution overview
+=================
+The idea is to delay discarding potential index_merge plans until the point 
+where it is really necessary. 
+
+This way, we won't have to do much changes in the range analyzer, but will be
+able to keep potential index_merge plan just enough so that it's possible to 
+take it into consideration together with range access plans. 
+
+Since there are no changes in the optimizer, the ability to consider both
+range and index_merge options will be limited to WHERE clauses of this form:
+
+ WHERE := range_cond(key1_1) AND 
+          range_cond(key2_1) AND
+          other_cond       AND 
+          index_merge_OR_cond1(key3_1, key3_2, ...)
+          index_merge_OR_cond2(key4_1, key4_2, ...)
+  
+where 
+  
+ index_merge_OR_cond{N} := (range_cond(keyN_1) OR 
+                            range_cond(keyN_2) OR ...)
+
+
+ range_cond(keyX) := condition that allows to construct range access of keyX
+                     and doesn't allow to construct range/index_merge accesses
+                     for any keys of the table in question. 
+
+
+For such WHERE clauses, the range analyzer will produce SEL_TREE of this form:
+
+  SEL_TREE(
+    range(key1_1),
+    ...
+    range(key2_1),
+    SEL_IMERGE(                         (1)
+      SEL_TREE(key3_1})
+      SEL_TREE(key3_2})
+      ...
+    )
+    ...
+  )
+
+which can be used to make a cost-based choice between range and index_merge.
+
+Limitations
+-----------
+This will not be a full solution in a sense that the range analyzer will not
+be able to produce sel_tree (1) if the WHERE clause is specified in other form
+(e.g. brackets were opened).
+
+TODO
+----
+* is it a problem if there are keys that are referred to both from
+  index_merge and from range access?
+
+* How strict is the limitation on the form of the WHERE?
+
+* TODO: The optimizer didn't compare costs of index_merge and range before (ok 
+  it did but that was done for accesses to different tables). Will there be any
+  possible gotchas here?
 



DESCRIPTION:

The current range optimizer discards possible index_merge/[sort]union
strategies whenever a range plan is possible. This is one of the measures
taken to avoid a combinatorial explosion of possible range/index_merge
strategies.

A bad side effect of this is that for WHERE clauses of the form

t.key1='very-frequent-value' AND (t.key2='rare-value1' OR t.key3='rare-value2')

the optimizer will
 - discard union(key2,key3) in favor of range(key1)
 - consider the cost of using range(key1) and discard that plan as well
The overall effect is that a possibly poor range access prevents a possibly
good index_merge access from being considered.
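
The effect can be illustrated with a toy cost comparison. This is only a
sketch: the row counts, the cost constants and all function names below are
made up for illustration and do not correspond to anything in the server code.

```python
# Toy model only: constants and names are assumptions, not server internals.
LOOKUP_COST = 2.0      # assumed cost of one row fetched through an index
MERGE_OVERHEAD = 1.1   # assumed extra factor for merging rowid streams

def plan_costs(table_rows, range_rows, union_rows):
    """Return a made-up cost for each candidate access method."""
    return {
        "full scan": float(table_rows),
        "range(key1)": range_rows * LOOKUP_COST,
        "union(key2,key3)": sum(union_rows) * LOOKUP_COST * MERGE_OVERHEAD,
    }

def current_choice(costs):
    """Behaviour described above: union is dropped because a range plan exists."""
    remaining = {name: c for name, c in costs.items()
                 if not name.startswith("union")}
    return min(remaining, key=remaining.get)

def fair_choice(costs):
    """Behaviour this WL aims for: all candidates are compared by cost."""
    return min(costs, key=costs.get)

# key1 matches 900,000 of 1,000,000 rows; key2 and key3 match 10 and 20 rows.
costs = plan_costs(1_000_000, 900_000, [10, 20])
print(current_choice(costs))
print(fair_choice(costs))
```

With these numbers the first function ends up with a full scan, while the
second one picks the union, which is the outcome this WL is after.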

This WL is about lifting this limitation for at least some subset of WHERE
clauses.


HIGH-LEVEL SPECIFICATION:



(Not a ready HLS but draft)
<contents>
Solution overview
Limitations
TODO

</contents>

Solution overview
=================
The idea is to delay discarding potential index_merge plans until the point 
where it is really necessary. 

This way, we won't have to make many changes in the range analyzer, but will be
able to keep a potential index_merge plan just long enough to
take it into consideration together with range access plans.

Since there are no changes in the optimizer, the ability to consider both
range and index_merge options will be limited to WHERE clauses of this form:

 WHERE := range_cond(key1_1) AND 
          range_cond(key2_1) AND
          other_cond       AND 
          index_merge_OR_cond1(key3_1, key3_2, ...) AND
          index_merge_OR_cond2(key4_1, key4_2, ...)
  
where 
  
 index_merge_OR_cond{N} := (range_cond(keyN_1) OR 
                            range_cond(keyN_2) OR ...)


 range_cond(keyX) := condition that allows constructing a range access on keyX
                     and does not allow constructing range/index_merge accesses
                     on any other key of the table in question.
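
To make the shape of the supported WHERE clauses concrete, here is a minimal
sketch that splits a condition into its range, index_merge and other parts.
The tuple encoding of conditions and the function names are assumptions made
for this example only. It is a toy model and not how the range analyzer
represents conditions.

```python
# Toy encoding (an assumption for this sketch):
#   ("range", "keyX")      a range_cond on keyX
#   ("other",)             a condition not usable for any key
#   ("and", [c1, c2, ...]) conjunction
#   ("or",  [c1, c2, ...]) disjunction

def is_range_cond(cond):
    return cond[0] == "range"

def is_index_merge_or(cond):
    # An OR qualifies only if every branch is a plain range_cond.
    return (cond[0] == "or" and len(cond[1]) >= 2
            and all(is_range_cond(branch) for branch in cond[1]))

def split_where(cond):
    """Split the top-level conjunction into the parts named in the grammar."""
    conjuncts = cond[1] if cond[0] == "and" else [cond]
    parts = {"ranges": [], "imerges": [], "other": []}
    for c in conjuncts:
        if is_range_cond(c):
            parts["ranges"].append(c[1])
        elif is_index_merge_or(c):
            parts["imerges"].append([branch[1] for branch in c[1]])
        else:
            parts["other"].append(c)
    return parts

# key1 = c AND (key2 = a OR key3 = b)
where = ("and", [("range", "key1"),
                 ("or", [("range", "key2"), ("range", "key3")])])
print(split_where(where))
```

In this toy model, the same condition written with the brackets expanded,
(key1 AND key2) OR (key1 AND key3), has an OR at the top level whose branches
are ANDs, so it ends up entirely in "other" and no index_merge part is found.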


For such WHERE clauses, the range analyzer will produce a SEL_TREE of this form:

  SEL_TREE(
    range(key1_1),
    ...
    range(key2_1),
    SEL_IMERGE(                         (1)
      SEL_TREE(key3_1)
      SEL_TREE(key3_2)
      ...
    )
    ...
  )

which can be used to make a cost-based choice between range and index_merge.
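
A rough sketch of how such a structure could feed a cost-based choice is
given below. The SelTree and SelImerge classes only borrow the names of the
server structures. Their fields, the row estimates and the rule "cost = number
of rows" are assumptions made for this illustration.

```python
from dataclasses import dataclass, field

@dataclass
class SelTree:
    # key name -> estimated rows for the range on that key (made-up numbers)
    ranges: dict = field(default_factory=dict)
    # list of SelImerge objects attached to this tree
    imerges: list = field(default_factory=list)

@dataclass
class SelImerge:
    # one SelTree per OR branch
    trees: list = field(default_factory=list)

def candidate_plans(tree):
    """List (plan name, estimated rows) for every range and every union."""
    plans = [(f"range({key})", rows) for key, rows in tree.ranges.items()]
    for imerge in tree.imerges:
        if not imerge.trees:
            continue
        keys, total = [], 0
        for sub in imerge.trees:
            if not sub.ranges:
                break  # a branch without a usable range rules out the union
            key, rows = min(sub.ranges.items(), key=lambda kv: kv[1])
            keys.append(key)
            total += rows
        else:
            plans.append((f"union({','.join(keys)})", total))
    return plans

def cheapest(tree):
    # Assumption: fewer estimated rows means a cheaper plan.
    return min(candidate_plans(tree), key=lambda plan: plan[1], default=None)

tree = SelTree(ranges={"key1": 900_000},
               imerges=[SelImerge([SelTree({"key2": 10}),
                                   SelTree({"key3": 20})])])
print(cheapest(tree))
```

The point of the sketch is only that once ranges and SEL_IMERGE elements
live in the same tree, both kinds of plans can be enumerated and compared in
one place.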

Limitations
-----------
This will not be a full solution, in the sense that the range analyzer will not
be able to produce SEL_TREE (1) if the WHERE clause is written in another form
(e.g. with the brackets expanded).

TODO
----
* Is it a problem if some keys are referred to both from
  index_merge and from range access?

* How strict is the limitation on the form of the WHERE?

* The optimizer didn't compare costs of index_merge and range before (OK,
  it did, but only for accesses to different tables). Will there be any
  possible gotchas here?


ESTIMATED WORK TIME

ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)