maria-developers team mailing list archive

Thread
Date
Re: proposed patches for MWL#113: many clustered keys (not only primary)

To: monty@xxxxxxxxxxxx
From: Zardosht Kasheff <zardosht@xxxxxxxxx>
Date: Wed, 18 May 2011 12:36:33 -0400
Cc: maria-developers@xxxxxxxxxxxxxxxxxxx
In-reply-to: <19923.62197.885761.981858@narttu.askmonty.org>
Hello Monty,

Thank you for your feedback on the four diffs and for approving them.
This makes getting TokuDB to work with MariaDB much simpler.

I could not tell from the original email, but is the change Sergei
made in http://lists.askmonty.org/pipermail/commits/2010-October/000626.html
going to be committed as well?

Thanks
-Zardosht

On Wed, May 18, 2011 at 12:25 PM, Michael Widenius <monty@xxxxxxxxxxxx> wrote:
>
> Hi!
>
> Here comes the feedback, patch by patch.
>
> Those that are ok 'as such or with small modifications' I will add to
> the 5.3 tree at once.
>
>>>>>> "Zardosht" == Zardosht Kasheff <zardosht@xxxxxxxxx> writes:
>
> Zardosht> Hello all,
> Zardosht> First off, thanks for hosting the storage engine summit last Friday.
>
> Zardosht> At the beginning of the day, I was encouraged to send the patches that
> Zardosht> I had for MWL#113 to the MariaDB developers list, in hopes possibly
> Zardosht> getting them into MariaDB 5.3. Here they are. They are patched off of
> Zardosht> MariaDB 5.2.3.
>
> Zardosht> They are all relatively simple. Here is what the patches do.
> Zardosht>  - 1.diff - add an index flag HA_CLUSTERED_INDEX. If a storage engine
> Zardosht> exposes this flag for an index, then that index is clustered. This was
> Zardosht> proposed in http://bugs.mysql.com/bug.php?id=51687
>
>
> 1.diff
>
>> Index: sql/handler.h
>> ===================================================================
>> --- sql/handler.h     (revision 25779)
>> +++ sql/handler.h     (revision 25780)
>> @@ -151,6 +151,10 @@
>>  #define HA_READ_RANGE           8       /* can find all records in a range */
>>  #define HA_ONLY_WHOLE_INDEX  16      /* Can't use part key searches */
>>  #define HA_KEYREAD_ONLY         64   /* Support HA_EXTRA_KEYREAD */
>> +// no IO if read data when scan index
>> +//  i.e index is covering
>> +// skipping 128 because SOME product has HA_KEY_SCAN_NOT_ROR
>> +#define HA_CLUSTERED_INDEX      256
>
>  /*
>   bits in alter_table_flags:
>
> I changed the above to use 512, as this is the next free flag. I also
> added this comment:
>
> /*
>  Data is clustered on this key. This means that when you read the key
>  you also get the row data in the same block.
> */
>
> -------------
> I also added new comments in the code for how clustered data was used.
> I also removed some calls to table->file->primary_key_is_clustered().
>
> The definition for primary_key_is_clustered() is after this as
> follows:
>
> /*
>   Check if the primary key (if there is one) is a clustered key covering
>   all fields. This means:
>
>   - Data is stored together with the primary key (no secondary lookup
>     needed to find the row data). The optimizer uses this to find out
>     the cost of fetching data.
>   - The primary key is part of each secondary key and is used
>     to find the row data in the primary index when reading trough
>     secondary indexes.
>   - When doing a HA_KEYREAD_ONLY we get also all the primary key parts
>     into the row. This is critical property used by index_merge.
>
>   For a clustered primary key, index_flags() returns also HA_CLUSTERED_INDEX
>
>   @retval TRUE   yes
>   @retval FALSE  No.
> */
>
> I also fixed the part in the code where it was possible to use the new
> HA_CLUSTERED_INDEX flag and changed get_best_ror_intersect() to prefer
> non clustered indexes before clustered index.
>
> Code is now commited to my 5.3 tree; Will be pushed shortly.
>
> -----------
>
> Zardosht>  - 2.diff - simple fix to get select ... order by DESC working
>
> My changes for 1.diff fixed this.
>
> The code is now:
>
> Here is your change:
>
>> +          bool is_covering= table->covering_keys.is_set(nr) ||
>> +                            (nr == table->s->primary_key &&
>> +                        table->file->primary_key_is_clustered()) ||
>> +                        test(table->file->index_flags(nr, 0, 0) & HA_CLUSTERED_INDEX);
>>
>> -        bool is_covering= table->covering_keys.is_set(nr) ||
>> -                          (nr == table->s->primary_key &&
>> -                          table->file->primary_key_is_clustered());
>> -
>
> Here is the code after my change:
>
>        bool is_covering= (table->covering_keys.is_set(nr) ||
>                           (table->file->index_flags(nr, 0, 1) &
>                            HA_CLUSTERED_INDEX));
>
> I was able to remove the test for
> table->file->primary_key_is_clustered() as I fixed all innodb code to
> set HA_CLUSTERED_INDEX for clustered keys.
>
> Zardosht>  - 3.diff - fix for index merge and clustering keys. This was inspired
> Zardosht> by Sergey Petrunya originally proposing
> Zardosht> http://lists.mysql.com/internals/37165. It has been altered a bit now
> Zardosht> that HA_CLUSTERED_INDEX is added.
>
>> --- sql/opt_range.cc  (revision 25783)
>> +++ sql/opt_range.cc  (revision 25784)
>> @@ -4497,8 +4497,20 @@
>>    for (idx= 0, cur_ror_scan= tree->ror_scans; idx < param->keys; idx++)
>>    {
>>      ROR_SCAN_INFO *scan;
>> +    uint keyno= param->real_keynr[idx];
>> +
>>      if (!tree->ror_scans_map.is_set(idx))
>> +    {
>>        continue;
>> +    }
>> +    /*
>> +      Ignore clustering keys.
>> +    */
>> +    if (keyno != cpk_no && test(param->table->file->index_flags(keyno,0,0) & HA_CLUSTERED_INDEX))
>> +    {
>> +      tree->n_ror_scans--;
>> +      continue;
>> +    }
>>      if (!(scan= make_ror_scan(param, idx, tree->keys[idx])))
>>        return NULL;
>>      if (param->real_keynr[idx] == cpk_no)
>
> Hm.. I originally changed this code to put the clustered keys last.
> The idea was to make it possible to work with engines that supports
> multiple clustered keys but still prefer normal keys first.
>
> Have now changed the code to use your approach instead.
>
> Zardosht>  - 4.diff - extend fix of MySQL bug 50843 for clustering keys
> Zardosht>  - http://lists.askmonty.org/pipermail/commits/2010-October/000626.html.
> Zardosht> I cannot find this checked in anywhere, but it seems to be a good
> Zardosht> patch for clustering keys. This was inspired from
> Zardosht> http://www.mail-archive.com/maria-developers@xxxxxxxxxxxxxxxxxxx/msg03491.html.
> Zardosht> We currently have a hacky workaround for this issue, but this seems
> Zardosht> like the right fix.
>
> This part also changed in my original patch.
>
>> +
>> +      Also not in case where the index is a clustering index
>>      */
>>      if ((select_limit >= table_records) &&
>>          (tab->type == JT_ALL &&
>>           tab->join->tables > tab->join->const_tables + 1) &&
>>           ((unsigned) best_key != table->s->primary_key ||
>> -          !table->file->primary_key_is_clustered()))
>> +          !table->file->primary_key_is_clustered()) &&
>> +          !(best_key >= 0 && test(table->file->index_flags(best_key,0,0) & HA_CLUSTERED_INDEX))
>> +          )
>>        DBUG_RETURN(0);
>
> The new code looks like:
>
>    if (best_key < 0 ||
>        ((select_limit >= table_records) &&
>         (tab->type == JT_ALL &&
>          tab->join->tables > tab->join->const_tables + 1) &&
>         !(table->file->index_flags(best_key, 0, 1) & HA_CLUSTERED_INDEX)))
>      goto use_filesort;
>
> No reson to test for table->file->primary_key_is_clustered() anymore.
>
> Zardosht> I also have one more fix for clustering keys and joins (in
> Zardosht> make_join_readinfo), but I want to look over it and before sending it.
>
> ok.
>
> Zardosht> Also, if for whatever reason all of the patches are not good enough
> Zardosht> for checking in, please consider some subset of the patches. This is
> Zardosht> by no means "all or none". That is why I broke it up into separate
> Zardosht> diff files.
>
> No problem; All changes are in.
> I will push as soon as I have got an ok from Serg / Spetrunia.
>
> Regards,
> Monty
>
Follow ups

Re: proposed patches for MWL#113: many clustered keys (not only primary)
From: Michael Widenius, 2011-05-19
Re: proposed patches for MWL#113: many clustered keys (not only primary)
From: Sergei Golubchik, 2011-05-18
References

proposed patches for MWL#113: many clustered keys (not only primary)
From: Zardosht Kasheff, 2011-04-21
Re: proposed patches for MWL#113: many clustered keys (not only primary)
From: Michael Widenius, 2011-05-18