maria-developers team mailing list archive
-
maria-developers team
-
Mailing list archive
-
Message #03271
Updated (by Psergey): Add DS-MRR support for clustered primary keys (121)
-----------------------------------------------------------------------
WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Add DS-MRR support for clustered primary keys
CREATION DATE..: Sat, 12 Jun 2010, 08:23
SUPERVISOR.....: Igor
IMPLEMENTOR....:
COPIES TO......:
CATEGORY.......: Client-BackLog
TASK ID........: 121 (http://askmonty.org/worklog/?tid=121)
VERSION........: Benchmarks-3.0
STATUS.........: Un-Assigned
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0
PROGRESS NOTES:
-=-=(Psergey - Sun, 13 Jun 2010, 11:56)=-=-
High-Level Specification modified.
--- /tmp/wklog.121.old.9396 2010-06-13 11:56:34.000000000 +0000
+++ /tmp/wklog.121.new.9396 2010-06-13 11:56:34.000000000 +0000
@@ -1,17 +1,55 @@
-Basic idea: DS-MRR scan should be done as follows:
+1. Choices to be made
+---------------------
-1. Sort incoming keys
-2. Use the sorted keys to do a disk-ordered retrieval
+1.1 Handling of complex ranges
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+The "sort incoming keys" part is easy when we have only equality ranges.
+If we allow ranges of arbitrary form (including ranges with one endpoint
+being infinity and/or ranges overlapping with one another), sorting becomes
+non-trivial. Do we need to support this case or support only equality ranges?
-Unresolved questions:
+Decision: the new code should handle only the case with equality ranges.
+For non-equality ranges, the execution will proceed as before.
-* The "sort incoming keys" part is trivial when we have only equality ranges.
- If we allow ranges of arbitrary form ( including ranges with one endpoint
- being infinity or ranges overlapping with one another), sorting becomes
- non-trivial. Do we need to support this case or support only equality ranges?
+1.2 Handling index prefix scans
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+What do we do if asked to do a scan on a prefix of clustered PK?
-* Can/should we use the fact rowid=={clustered PK value} for InnoDB's clustered
- PKs? (current decision: No)
+Decision: handle this if the ranges are equality ranges. The difference from
+scan on full primary key is that
+- we will have to use read_range_XXX() or index_read()/index_next_same()
+ functions, while for full primary key value we could have used rnd_pos().
+- One equality range can produce multiple matching records.
-* Do we support scanning on a prefix of clustered PK? (Yes but scan on prefix
- of clustered PK will not be in disk order. We need to run it with regular mode)
+1.3 Use of knowledge that primary_key==rowid
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Can/should we use the fact rowid=={clustered PK value} for InnoDB's clustered
+PKs?
+Decision: don't make this assumption.
+
+
+2. Code-level changes overview
+------------------------------
+
+DsMrr_impl::choose_mrr_impl():
+Enable MRR when
+ - ihis is a clustered primary key
+ - incoming ranges are single-point (HA_MRR_SINGLE_POINT is set)
+ - will need to make the SQL layer to set this flag
+ - incoming ranges are not already sorted (HA_MRR_SORTED is not set)
+
+(TODO do we need new cost formula?)
+
+DsMrr_impl::dsmrr_init()
+ - different elem_size (not rowid length but key tuple length)
+ - don't create the secondary handler object, we won't need it.
+
+DsMrr_impl::dsmrr_fill_buffer():
+ - need a variant of this function that would not access the index but just
+ fill and sort the array.
+
+DsMrr_impl::dsmrr_next():
+ - should abstract-out:
+ - buffer element size
+ - rnd_pos/index_read call.
+ - Also for CPK prefix scans there can be multi
-=-=(Psergey - Sun, 13 Jun 2010, 11:55)=-=-
High Level Description modified.
--- /tmp/wklog.121.old.9380 2010-06-13 11:55:42.000000000 +0000
+++ /tmp/wklog.121.new.9380 2010-06-13 11:55:42.000000000 +0000
@@ -1,18 +1,18 @@
-Currently, DS-MRR doesn't support operation over clustered primary keys. The
-reason for this was that
- - Clustered primary keys are stored in disk order and so, if the sequence of
- ranges is ordered, the reads will already go in disk order (and so DS-MRR's
- step of re-ordering reads is not necessary).
+Currently, DS-MRR doesn't allow to do MRR scans over clustered primary keys. The
+reason for this is that
+ - Clustered primary keys are stored in disk order, so, if the sequence of
+ scanned ranges is ordered, the reads will automatically happen in disk
+ order, and DS-MRR's step of re-ordering reads is redundant.
- Within DS-MRR implementation, the "get rowids from keys" step is not
necessary when using clustered primary key, because in InnoDB/XtraDB
clustered primary key values are the rowids.
-However, with BKA making the MRR calls, there are cases where DS-MRR over
+However, when MRR calls are made by BKA, there are cases where DS-MRR over
clustered primary key is beneficial:
* BKA may provide lookup keys that have duplicates and/or are in arbitrary
order. In that case, DS-MRR implementation may sort the key values and
- order them, so that it hits the disk in key order.
+ order them, so that it hits the disk in key(=disk) order.
* When running multi-table join with high @@join_cache_level value (and so,
linked join buffers), lack of MRR implementation causes the chain of linked
@@ -20,3 +20,9 @@
* TODO anything else?
+This WL entry is about addressing the above by adding support of DS-MRR over
+clustered primary key that would work according to this strategy:
+1. Sort incoming keys
+2. Use the sorted keys to do a disk-ordered retrieval.
+
+
-=-=(Psergey - Sun, 13 Jun 2010, 09:42)=-=-
High-Level Specification modified.
--- /tmp/wklog.121.old.5009 2010-06-13 09:42:38.000000000 +0000
+++ /tmp/wklog.121.new.5009 2010-06-13 09:42:38.000000000 +0000
@@ -6,11 +6,12 @@
Unresolved questions:
* The "sort incoming keys" part is trivial when we have only equality ranges.
- If we allow ranges of arbitrary form ( ncluding ranges with one endpoint
+ If we allow ranges of arbitrary form ( including ranges with one endpoint
being infinity or ranges overlapping with one another), sorting becomes
- non-trival. Do we need to support this case or support only equality ranges?
+ non-trivial. Do we need to support this case or support only equality ranges?
* Can/should we use the fact rowid=={clustered PK value} for InnoDB's clustered
- PKs?
+ PKs? (current decision: No)
-* Do we support scanning on a prefix of clustered PK? (seems to be yes?)
+* Do we support scanning on a prefix of clustered PK? (Yes but scan on prefix
+ of clustered PK will not be in disk order. We need to run it with regular mode)
-=-=(Psergey - Sat, 12 Jun 2010, 08:39)=-=-
High-Level Specification modified.
--- /tmp/wklog.121.old.538 2010-06-12 08:39:46.000000000 +0000
+++ /tmp/wklog.121.new.538 2010-06-12 08:39:46.000000000 +0000
@@ -1 +1,16 @@
+Basic idea: DS-MRR scan should be done as follows:
+1. Sort incoming keys
+2. Use the sorted keys to do a disk-ordered retrieval
+
+Unresolved questions:
+
+* The "sort incoming keys" part is trivial when we have only equality ranges.
+ If we allow ranges of arbitrary form ( ncluding ranges with one endpoint
+ being infinity or ranges overlapping with one another), sorting becomes
+ non-trival. Do we need to support this case or support only equality ranges?
+
+* Can/should we use the fact rowid=={clustered PK value} for InnoDB's clustered
+ PKs?
+
+* Do we support scanning on a prefix of clustered PK? (seems to be yes?)
DESCRIPTION:
Currently, DS-MRR doesn't allow to do MRR scans over clustered primary keys. The
reason for this is that
- Clustered primary keys are stored in disk order, so, if the sequence of
scanned ranges is ordered, the reads will automatically happen in disk
order, and DS-MRR's step of re-ordering reads is redundant.
- Within DS-MRR implementation, the "get rowids from keys" step is not
necessary when using clustered primary key, because in InnoDB/XtraDB
clustered primary key values are the rowids.
However, when MRR calls are made by BKA, there are cases where DS-MRR over
clustered primary key is beneficial:
* BKA may provide lookup keys that have duplicates and/or are in arbitrary
order. In that case, DS-MRR implementation may sort the key values and
order them, so that it hits the disk in key(=disk) order.
* When running multi-table join with high @@join_cache_level value (and so,
linked join buffers), lack of MRR implementation causes the chain of linked
join buffers to break. (TODO and so what? Is that really a problem?)
* TODO anything else?
This WL entry is about addressing the above by adding support of DS-MRR over
clustered primary key that would work according to this strategy:
1. Sort incoming keys
2. Use the sorted keys to do a disk-ordered retrieval.
HIGH-LEVEL SPECIFICATION:
1. Choices to be made
---------------------
1.1 Handling of complex ranges
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The "sort incoming keys" part is easy when we have only equality ranges.
If we allow ranges of arbitrary form (including ranges with one endpoint
being infinity and/or ranges overlapping with one another), sorting becomes
non-trivial. Do we need to support this case or support only equality ranges?
Decision: the new code should handle only the case with equality ranges.
For non-equality ranges, the execution will proceed as before.
1.2 Handling index prefix scans
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
What do we do if asked to do a scan on a prefix of clustered PK?
Decision: handle this if the ranges are equality ranges. The difference from
scan on full primary key is that
- we will have to use read_range_XXX() or index_read()/index_next_same()
functions, while for full primary key value we could have used rnd_pos().
- One equality range can produce multiple matching records.
1.3 Use of knowledge that primary_key==rowid
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Can/should we use the fact rowid=={clustered PK value} for InnoDB's clustered
PKs?
Decision: don't make this assumption.
2. Code-level changes overview
------------------------------
DsMrr_impl::choose_mrr_impl():
Enable MRR when
- ihis is a clustered primary key
- incoming ranges are single-point (HA_MRR_SINGLE_POINT is set)
- will need to make the SQL layer to set this flag
- incoming ranges are not already sorted (HA_MRR_SORTED is not set)
(TODO do we need new cost formula?)
DsMrr_impl::dsmrr_init()
- different elem_size (not rowid length but key tuple length)
- don't create the secondary handler object, we won't need it.
DsMrr_impl::dsmrr_fill_buffer():
- need a variant of this function that would not access the index but just
fill and sort the array.
DsMrr_impl::dsmrr_next():
- should abstract-out:
- buffer element size
- rnd_pos/index_read call.
- Also for CPK prefix scans there can be multi
ESTIMATED WORK TIME
ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)