maria-developers team mailing list archive
-
maria-developers team
-
Mailing list archive
-
Message #00315
Updated (by Guest): index_merge: fair choice between index_merge union and range access (24)
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: index_merge: fair choice between index_merge union and range access
CREATION DATE..: Tue, 26 May 2009, 12:10
SUPERVISOR.....: Monty
IMPLEMENTOR....: Psergey
COPIES TO......: Psergey
CATEGORY.......: Server-RawIdeaBin
TASK ID........: 24 (http://askmonty.org/worklog/?tid=24)
VERSION........: 9.x
STATUS.........: Un-Assigned
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0
PROGRESS NOTES:
-=-=(Guest - Mon, 01 Jun 2009, 23:30)=-=-
High-Level Specification modified.
--- /tmp/wklog.24.old.21580 2009-06-01 23:30:06.000000000 +0300
+++ /tmp/wklog.24.new.21580 2009-06-01 23:30:06.000000000 +0300
@@ -64,6 +64,9 @@
* How strict is the limitation on the form of the WHERE?
+* Which version should this be based on? 5.1? Which patches are should be in
+ (google's/percona's/maria/etc?)
+
* TODO: The optimizer didn't compare costs of index_merge and range before (ok
it did but that was done for accesses to different tables). Will there be any
possible gotchas here?
-=-=(Guest - Wed, 27 May 2009, 14:41)=-=-
Category updated.
--- /tmp/wklog.24.old.8414 2009-05-27 14:41:43.000000000 +0300
+++ /tmp/wklog.24.new.8414 2009-05-27 14:41:43.000000000 +0300
@@ -1 +1 @@
-Client-BackLog
+Server-RawIdeaBin
-=-=(Guest - Wed, 27 May 2009, 14:41)=-=-
Version updated.
--- /tmp/wklog.24.old.8414 2009-05-27 14:41:43.000000000 +0300
+++ /tmp/wklog.24.new.8414 2009-05-27 14:41:43.000000000 +0300
@@ -1 +1 @@
-Server-9.x
+9.x
-=-=(Guest - Wed, 27 May 2009, 13:59)=-=-
Title modified.
--- /tmp/wklog.24.old.9498 2009-05-27 13:59:23.000000000 +0300
+++ /tmp/wklog.24.new.9498 2009-05-27 13:59:23.000000000 +0300
@@ -1 +1 @@
-index_merge optimizer: dont discard index_merge union strategies when range is available
+index_merge: fair choice between index_merge union and range access
-=-=(Guest - Wed, 27 May 2009, 13:59)=-=-
Version updated.
--- /tmp/wklog.24.old.9498 2009-05-27 13:59:23.000000000 +0300
+++ /tmp/wklog.24.new.9498 2009-05-27 13:59:23.000000000 +0300
@@ -1 +1 @@
-Benchmarks-3.0
+Server-9.x
-=-=(Guest - Tue, 26 May 2009, 13:27)=-=-
High-Level Specification modified.
--- /tmp/wklog.24.old.305 2009-05-26 13:27:32.000000000 +0300
+++ /tmp/wklog.24.new.305 2009-05-26 13:27:32.000000000 +0300
@@ -1 +1,70 @@
+(Not a ready HLS but draft)
+<contents>
+Solution overview
+Limitations
+TODO
+
+</contents>
+
+Solution overview
+=================
+The idea is to delay discarding potential index_merge plans until the point
+where it is really necessary.
+
+This way, we won't have to do much changes in the range analyzer, but will be
+able to keep potential index_merge plan just enough so that it's possible to
+take it into consideration together with range access plans.
+
+Since there are no changes in the optimizer, the ability to consider both
+range and index_merge options will be limited to WHERE clauses of this form:
+
+ WHERE := range_cond(key1_1) AND
+ range_cond(key2_1) AND
+ other_cond AND
+ index_merge_OR_cond1(key3_1, key3_2, ...)
+ index_merge_OR_cond2(key4_1, key4_2, ...)
+
+where
+
+ index_merge_OR_cond{N} := (range_cond(keyN_1) OR
+ range_cond(keyN_2) OR ...)
+
+
+ range_cond(keyX) := condition that allows to construct range access of keyX
+ and doesn't allow to construct range/index_merge accesses
+ for any keys of the table in question.
+
+
+For such WHERE clauses, the range analyzer will produce SEL_TREE of this form:
+
+ SEL_TREE(
+ range(key1_1),
+ ...
+ range(key2_1),
+ SEL_IMERGE( (1)
+ SEL_TREE(key3_1})
+ SEL_TREE(key3_2})
+ ...
+ )
+ ...
+ )
+
+which can be used to make a cost-based choice between range and index_merge.
+
+Limitations
+-----------
+This will not be a full solution in a sense that the range analyzer will not
+be able to produce sel_tree (1) if the WHERE clause is specified in other form
+(e.g. brackets were opened).
+
+TODO
+----
+* is it a problem if there are keys that are referred to both from
+ index_merge and from range access?
+
+* How strict is the limitation on the form of the WHERE?
+
+* TODO: The optimizer didn't compare costs of index_merge and range before (ok
+ it did but that was done for accesses to different tables). Will there be any
+ possible gotchas here?
DESCRIPTION:
Current range optimizer will discard possible index_merge/[sort]union
strategies when there is a possible range plan. This action is a part of
measures we take to avoid combinatorial explosion of possible range/
index_merge strategies.
A bad side effect of this is that for WHERE clauses in form
t.key1= 'very-frequent-value' AND (t.key2='rare-value1' OR t.key3='rare-value2')
the optimizer will
- discard union(key2,key3) in favor of range(key1)
- consider costs of using range(key1) and discard that plan also
and the overall effect is that possible poor range access will cause possible
good index_merge access not to be considered.
This WL is to about lifting this limitation at least for some subset of WHERE
clauses.
HIGH-LEVEL SPECIFICATION:
(Not a ready HLS but draft)
<contents>
Solution overview
Limitations
TODO
</contents>
Solution overview
=================
The idea is to delay discarding potential index_merge plans until the point
where it is really necessary.
This way, we won't have to do much changes in the range analyzer, but will be
able to keep potential index_merge plan just enough so that it's possible to
take it into consideration together with range access plans.
Since there are no changes in the optimizer, the ability to consider both
range and index_merge options will be limited to WHERE clauses of this form:
WHERE := range_cond(key1_1) AND
range_cond(key2_1) AND
other_cond AND
index_merge_OR_cond1(key3_1, key3_2, ...)
index_merge_OR_cond2(key4_1, key4_2, ...)
where
index_merge_OR_cond{N} := (range_cond(keyN_1) OR
range_cond(keyN_2) OR ...)
range_cond(keyX) := condition that allows to construct range access of keyX
and doesn't allow to construct range/index_merge accesses
for any keys of the table in question.
For such WHERE clauses, the range analyzer will produce SEL_TREE of this form:
SEL_TREE(
range(key1_1),
...
range(key2_1),
SEL_IMERGE( (1)
SEL_TREE(key3_1})
SEL_TREE(key3_2})
...
)
...
)
which can be used to make a cost-based choice between range and index_merge.
Limitations
-----------
This will not be a full solution in a sense that the range analyzer will not
be able to produce sel_tree (1) if the WHERE clause is specified in other form
(e.g. brackets were opened).
TODO
----
* is it a problem if there are keys that are referred to both from
index_merge and from range access?
* How strict is the limitation on the form of the WHERE?
* Which version should this be based on? 5.1? Which patches are should be in
(google's/percona's/maria/etc?)
* TODO: The optimizer didn't compare costs of index_merge and range before (ok
it did but that was done for accesses to different tables). Will there be any
possible gotchas here?
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)