maria-developers team mailing list archive

Thread
Date
Updated (by Psergey): Add DS-MRR support for clustered primary keys (121)

To: sergey@xxxxxxxxxxxx
From: worklog-noreply@xxxxxxxxxxxx
Date: Sun, 13 Jun 2010 11:56:34 +0000 (UTC)
Delivered-to: worklog-db@xxxxxxxxxxxx
-----------------------------------------------------------------------
                              WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Add DS-MRR support for clustered primary keys
CREATION DATE..: Sat, 12 Jun 2010, 08:23
SUPERVISOR.....: Igor
IMPLEMENTOR....: 
COPIES TO......: 
CATEGORY.......: Client-BackLog
TASK ID........: 121 (http://askmonty.org/worklog/?tid=121)
VERSION........: Benchmarks-3.0
STATUS.........: Un-Assigned
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0

PROGRESS NOTES:

-=-=(Psergey - Sun, 13 Jun 2010, 11:56)=-=-
High-Level Specification modified.
--- /tmp/wklog.121.old.9396     2010-06-13 11:56:34.000000000 +0000
+++ /tmp/wklog.121.new.9396     2010-06-13 11:56:34.000000000 +0000
@@ -1,17 +1,55 @@
-Basic idea: DS-MRR scan should be done as follows:
+1. Choices to be made
+---------------------
 
-1. Sort incoming keys 
-2. Use the sorted keys to do a disk-ordered retrieval
+1.1 Handling of complex ranges
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+The "sort incoming keys" part is easy when we have only equality ranges.
+If we allow ranges of arbitrary form (including ranges with one endpoint
+being infinity and/or ranges overlapping with one another), sorting becomes
+non-trivial. Do we need to support this case or support only equality ranges?
 
-Unresolved questions:
+Decision: the new code should handle only the case with equality ranges.
+For non-equality ranges, the execution will proceed as before.
 
-* The "sort incoming keys" part is trivial when we have only equality ranges.
-  If we allow ranges of arbitrary form ( including ranges with one endpoint 
-  being infinity or ranges overlapping with one another), sorting becomes
-  non-trivial. Do we need to support this case or support only equality ranges?
+1.2 Handling index prefix scans
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+What do we do if asked to do a scan on a prefix of clustered PK? 
 
-* Can/should we use the fact rowid=={clustered PK value} for InnoDB's clustered
-  PKs? (current decision: No)
+Decision: handle this if the ranges are equality ranges. The difference from
+scan on full primary key is that 
+- we will have to use read_range_XXX() or index_read()/index_next_same() 
+  functions, while for full primary key value we could have used rnd_pos().
+- One equality range can produce multiple matching records.
 
-* Do we support scanning on a prefix of clustered PK? (Yes but scan on prefix 
-  of clustered PK will not be in disk order. We need to run it with regular mode)
+1.3 Use of knowledge that primary_key==rowid
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Can/should we use the fact rowid=={clustered PK value} for InnoDB's clustered
+PKs? 
+Decision: don't make this assumption.
+
+
+2. Code-level changes overview
+------------------------------
+
+DsMrr_impl::choose_mrr_impl():
+Enable MRR when 
+  - ihis is a clustered primary key
+  - incoming ranges are single-point (HA_MRR_SINGLE_POINT is set)
+     - will need to make the SQL layer to set this flag
+  - incoming ranges are not already sorted (HA_MRR_SORTED is not set)
+
+(TODO do we need new cost formula?)
+
+DsMrr_impl::dsmrr_init() 
+ - different elem_size (not rowid length but key tuple length)
+ - don't create the secondary handler object, we won't need it.
+
+DsMrr_impl::dsmrr_fill_buffer():
+ - need a variant of this function that would not access the index but just
+   fill and sort the array.
+
+DsMrr_impl::dsmrr_next():
+ - should abstract-out:
+   - buffer element size
+   - rnd_pos/index_read call.
+      - Also for CPK prefix scans there can be multi

-=-=(Psergey - Sun, 13 Jun 2010, 11:55)=-=-
High Level Description modified.
--- /tmp/wklog.121.old.9380     2010-06-13 11:55:42.000000000 +0000
+++ /tmp/wklog.121.new.9380     2010-06-13 11:55:42.000000000 +0000
@@ -1,18 +1,18 @@
-Currently, DS-MRR doesn't support operation over clustered primary keys. The
-reason for this was that
- - Clustered primary keys are stored in disk order and so, if the sequence of 
-   ranges is ordered, the reads will already go in disk order (and so DS-MRR's
-   step of re-ordering reads is not necessary).
+Currently, DS-MRR doesn't allow to do MRR scans over clustered primary keys. The
+reason for this is that
+ - Clustered primary keys are stored in disk order, so, if the sequence of
+   scanned ranges is ordered, the reads will automatically happen in disk
+   order, and DS-MRR's step of re-ordering reads is redundant.
  - Within DS-MRR implementation, the "get rowids from keys" step is not
    necessary when using clustered primary key, because in InnoDB/XtraDB 
    clustered primary key values are the rowids.
 
-However, with BKA making the MRR calls, there are cases where DS-MRR over
+However, when MRR calls are made by BKA, there are cases where DS-MRR over
 clustered primary key is beneficial:
 
 * BKA may provide lookup keys that have duplicates and/or are in arbitrary
   order. In that case, DS-MRR implementation may sort the key values and 
-  order them, so that it hits the disk in key order.
+  order them, so that it hits the disk in key(=disk) order.
 
 * When running multi-table join with high @@join_cache_level value (and so,
   linked join buffers), lack of MRR implementation causes the chain of linked
@@ -20,3 +20,9 @@
 
 * TODO anything else?
 
+This WL entry is about addressing the above by adding support of DS-MRR over
+clustered primary key that would work according to this strategy:
+1. Sort incoming keys 
+2. Use the sorted keys to do a disk-ordered retrieval.
+
+

-=-=(Psergey - Sun, 13 Jun 2010, 09:42)=-=-
High-Level Specification modified.
--- /tmp/wklog.121.old.5009     2010-06-13 09:42:38.000000000 +0000
+++ /tmp/wklog.121.new.5009     2010-06-13 09:42:38.000000000 +0000
@@ -6,11 +6,12 @@
 Unresolved questions:
 
 * The "sort incoming keys" part is trivial when we have only equality ranges.
-  If we allow ranges of arbitrary form ( ncluding ranges with one endpoint 
+  If we allow ranges of arbitrary form ( including ranges with one endpoint 
   being infinity or ranges overlapping with one another), sorting becomes
-  non-trival. Do we need to support this case or support only equality ranges?
+  non-trivial. Do we need to support this case or support only equality ranges?
 
 * Can/should we use the fact rowid=={clustered PK value} for InnoDB's clustered
-  PKs?
+  PKs? (current decision: No)
 
-* Do we support scanning on a prefix of clustered PK? (seems to be yes?)
+* Do we support scanning on a prefix of clustered PK? (Yes but scan on prefix 
+  of clustered PK will not be in disk order. We need to run it with regular mode)

-=-=(Psergey - Sat, 12 Jun 2010, 08:39)=-=-
High-Level Specification modified.
--- /tmp/wklog.121.old.538      2010-06-12 08:39:46.000000000 +0000
+++ /tmp/wklog.121.new.538      2010-06-12 08:39:46.000000000 +0000
@@ -1 +1,16 @@
+Basic idea: DS-MRR scan should be done as follows:
 
+1. Sort incoming keys 
+2. Use the sorted keys to do a disk-ordered retrieval
+
+Unresolved questions:
+
+* The "sort incoming keys" part is trivial when we have only equality ranges.
+  If we allow ranges of arbitrary form ( ncluding ranges with one endpoint 
+  being infinity or ranges overlapping with one another), sorting becomes
+  non-trival. Do we need to support this case or support only equality ranges?
+
+* Can/should we use the fact rowid=={clustered PK value} for InnoDB's clustered
+  PKs?
+
+* Do we support scanning on a prefix of clustered PK? (seems to be yes?)



DESCRIPTION:

Currently, DS-MRR doesn't allow to do MRR scans over clustered primary keys. The
reason for this is that
 - Clustered primary keys are stored in disk order, so, if the sequence of
   scanned ranges is ordered, the reads will automatically happen in disk
   order, and DS-MRR's step of re-ordering reads is redundant.
 - Within DS-MRR implementation, the "get rowids from keys" step is not
   necessary when using clustered primary key, because in InnoDB/XtraDB
   clustered primary key values are the rowids.

However, when MRR calls are made by BKA, there are cases where DS-MRR over
clustered primary key is beneficial:

* BKA may provide lookup keys that have duplicates and/or are in arbitrary
  order. In that case, DS-MRR implementation may sort the key values and 
  order them, so that it hits the disk in key(=disk) order.

* When running multi-table join with high @@join_cache_level value (and so,
  linked join buffers), lack of MRR implementation causes the chain of linked
  join buffers to break. (TODO and so what? Is that really a problem?)

* TODO anything else?

This WL entry is about addressing the above by adding support of DS-MRR over
clustered primary key that would work according to this strategy:
1. Sort incoming keys 
2. Use the sorted keys to do a disk-ordered retrieval.



HIGH-LEVEL SPECIFICATION:



1. Choices to be made
---------------------

1.1 Handling of complex ranges
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The "sort incoming keys" part is easy when we have only equality ranges.
If we allow ranges of arbitrary form (including ranges with one endpoint
being infinity and/or ranges overlapping with one another), sorting becomes
non-trivial. Do we need to support this case or support only equality ranges?

Decision: the new code should handle only the case with equality ranges.
For non-equality ranges, the execution will proceed as before.

1.2 Handling index prefix scans
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
What do we do if asked to do a scan on a prefix of clustered PK? 

Decision: handle this if the ranges are equality ranges. The difference from
scan on full primary key is that 
- we will have to use read_range_XXX() or index_read()/index_next_same() 
  functions, while for full primary key value we could have used rnd_pos().
- One equality range can produce multiple matching records.

1.3 Use of knowledge that primary_key==rowid
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Can/should we use the fact rowid=={clustered PK value} for InnoDB's clustered
PKs? 
Decision: don't make this assumption.


2. Code-level changes overview
------------------------------

DsMrr_impl::choose_mrr_impl():
Enable MRR when 
  - ihis is a clustered primary key
  - incoming ranges are single-point (HA_MRR_SINGLE_POINT is set)
     - will need to make the SQL layer to set this flag
  - incoming ranges are not already sorted (HA_MRR_SORTED is not set)

(TODO do we need new cost formula?)

DsMrr_impl::dsmrr_init() 
 - different elem_size (not rowid length but key tuple length)
 - don't create the secondary handler object, we won't need it.

DsMrr_impl::dsmrr_fill_buffer():
 - need a variant of this function that would not access the index but just
   fill and sort the array.

DsMrr_impl::dsmrr_next():
 - should abstract-out:
   - buffer element size
   - rnd_pos/index_read call.
      - Also for CPK prefix scans there can be multi


ESTIMATED WORK TIME

ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)