maria-developers team mailing list archive

Thread
Date
Updated (by Psergey): Add DS-MRR support for clustered primary keys (121)

To: igor@xxxxxxxxxxxx
From: worklog-noreply@xxxxxxxxxxxx
Date: Sat, 12 Jun 2010 08:39:46 +0000 (UTC)
Delivered-to: worklog-db@xxxxxxxxxxxx
-----------------------------------------------------------------------
                              WORKLOG TASK
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
TASK...........: Add DS-MRR support for clustered primary keys
CREATION DATE..: Sat, 12 Jun 2010, 08:23
SUPERVISOR.....: Igor
IMPLEMENTOR....: 
COPIES TO......: 
CATEGORY.......: Client-BackLog
TASK ID........: 121 (http://askmonty.org/worklog/?tid=121)
VERSION........: Benchmarks-3.0
STATUS.........: Un-Assigned
PRIORITY.......: 60
WORKED HOURS...: 0
ESTIMATE.......: 0 (hours remain)
ORIG. ESTIMATE.: 0

PROGRESS NOTES:

-=-=(Psergey - Sat, 12 Jun 2010, 08:39)=-=-
High-Level Specification modified.
--- /tmp/wklog.121.old.538      2010-06-12 08:39:46.000000000 +0000
+++ /tmp/wklog.121.new.538      2010-06-12 08:39:46.000000000 +0000
@@ -1 +1,16 @@
+Basic idea: DS-MRR scan should be done as follows:
 
+1. Sort incoming keys 
+2. Use the sorted keys to do a disk-ordered retrieval
+
+Unresolved questions:
+
+* The "sort incoming keys" part is trivial when we have only equality ranges.
+  If we allow ranges of arbitrary form ( ncluding ranges with one endpoint 
+  being infinity or ranges overlapping with one another), sorting becomes
+  non-trival. Do we need to support this case or support only equality ranges?
+
+* Can/should we use the fact rowid=={clustered PK value} for InnoDB's clustered
+  PKs?
+
+* Do we support scanning on a prefix of clustered PK? (seems to be yes?)



DESCRIPTION:

Currently, DS-MRR doesn't support operation over clustered primary keys. The
reason for this was that
 - Clustered primary keys are stored in disk order and so, if the sequence of 
   ranges is ordered, the reads will already go in disk order (and so DS-MRR's
   step of re-ordering reads is not necessary).
 - Within DS-MRR implementation, the "get rowids from keys" step is not
   necessary when using clustered primary key, because in InnoDB/XtraDB 
   clustered primary key values are the rowids.

However, with BKA making the MRR calls, there are cases where DS-MRR over
clustered primary key is beneficial:

* BKA may provide lookup keys that have duplicates and/or are in arbitrary
  order. In that case, DS-MRR implementation may sort the key values and 
  order them, so that it hits the disk in key order.

* When running multi-table join with high @@join_cache_level value (and so,
  linked join buffers), lack of MRR implementation causes the chain of linked
  join buffers to break. (TODO and so what? Is that really a problem?)

* TODO anything else?


HIGH-LEVEL SPECIFICATION:



Basic idea: DS-MRR scan should be done as follows:

1. Sort incoming keys 
2. Use the sorted keys to do a disk-ordered retrieval

Unresolved questions:

* The "sort incoming keys" part is trivial when we have only equality ranges.
  If we allow ranges of arbitrary form ( ncluding ranges with one endpoint 
  being infinity or ranges overlapping with one another), sorting becomes
  non-trival. Do we need to support this case or support only equality ranges?

* Can/should we use the fact rowid=={clustered PK value} for InnoDB's clustered
  PKs?

* Do we support scanning on a prefix of clustered PK? (seems to be yes?)


ESTIMATED WORK TIME

ESTIMATED COMPLETION DATE
-----------------------------------------------------------------------
WorkLog (v3.5.9)