← Back to team overview

maria-developers team mailing list archive

Re: [Commits] Rev 2994: Original idea from Zardosht Kasheff to add HA_CLUSTERED_INDEX in lp:maria/5.3



>>>>> "Sergei" == Sergei Golubchik <serg@xxxxxxxxxxxx> writes:


>> +  Data is clustered on this key. This means that when you read the key
>> +  you also get the row data in the same block.

Sergei> perhaps, better to say that "you get the row data without an extra seek
Sergei> (or even without an extra disk read)".

I added the comment.

Sergei> A question: what about memory-based (and SSD) engines? Could/Should one
Sergei> consider all indexes clustered?

No, as there is still a high cost for fetching and managing an extra
disk block.  SSD are faster than disk, but not unlimited fast.

For SSD, we need to add a different factor for the time to fetch a
block.  The storage engine can already take this into account a bit by
adjusting the times for scan / key reads if things are on ssd.

I have been discussing with igor for making all the timing constants
variables to allow us to do real time tuning with them.
(Good for finding better values for the constants that we have now).


>> -   @retval TRUE   Primary key (if there is one) is clustered
>> -                  key covering all fields
>> -   @retval FALSE  otherwise
>> +   Check if the primary key (if there is one) is a clustered key covering
>> +   all fields. This means:

Sergei> I wouldn't say that it "checks if the primary key is clustered", but
Sergei> something like "historical method that returns TRUE if everything below
Sergei> holds:"

I don't like to say 'historical' as there is no plans to remove this
code, only rename it...

I did change the start to:

   Check if the primary key (if there is one) is a clustered key
   covering all fields and that the primary key is also stored as part
   of all other keys and used as a reference key.

>> +
>> +   - Data is stored together with the primary key (no secondary lookup
>> +     needed to find the row data). The optimizer uses this to find out
>> +     the cost of fetching data.
>> +   - The primary key is part of each secondary key and is used
>> +     to find the row data in the primary index when reading trough
>> +     secondary indexes.
>> +   - When doing a HA_KEYREAD_ONLY we get also all the primary key parts
>> +     into the row. This is critical property used by index_merge.

Sergei> and here "all the above is usually true for engines that store the row
Sergei> data in the primary key index (e.g. in a b-tree), and use the primary
Sergei> key value as a position()"



>> +++ b/sql/sql_select.cc	2011-05-18 16:26:30 +0000
>> @@ -16525,9 +16527,9 @@ test_if_skip_sort_order(JOIN_TAB *tab,OR
>> */
>> DBUG_ASSERT (ref_key != (int) nr);
>> -        bool is_covering= table->covering_keys.is_set(nr) ||
>> -                          (nr == table->s->primary_key &&
>> -                          table->file->primary_key_is_clustered());
>> +        bool is_covering= (table->covering_keys.is_set(nr) ||
>> +                           (table->file->index_flags(nr, 0, 1) &
>> +                            HA_CLUSTERED_INDEX));

Sergei> See, that's what I said. It make sense to set covering_keys bitmap
Sergei> for all clustered indexes once, when the table is opened, and then you
Sergei> not only get this condition simpler, but probably will also cover other
Sergei> cases where table->covering_keys is checked (or may be checked in the
Sergei> future).

Looking at how covering keys is used in many places, I am not sure
that change is straightforward or will not cause wrong timing
results.  This is because in many cases we don't want to use a
clustered key when we are trying to do index-only reads as the
covering key is much slower than any other key.

I agree that your idea has some merits, but I would not like to try to
do it just now.  Something you can look at when you are ready with


Follow ups