maria-developers team mailing list archive

Thread
Date

Re: [Commits] Rev 2994: Original idea from Zardosht Kasheff to add HA_CLUSTERED_INDEX in lp:maria/5.3

To: monty@xxxxxxxxxxxx
From: Sergei Golubchik <serg@xxxxxxxxxxxx>
Date: Thu, 19 May 2011 14:43:26 +0200
Cc: maria-developers@xxxxxxxxxxxxxxxxxxx
In-reply-to: <20110518162631.083649FAD4@askmonty.fi>
User-agent: Mutt/1.5.16 (2007-06-09)

Hi, Michael!

On May 18, Michael Widenius wrote:
> === modified file 'sql/handler.h'
> --- a/sql/handler.h	2011-05-16 11:05:45 +0000
> +++ b/sql/handler.h	2011-05-18 16:26:30 +0000
> @@ -161,8 +161,11 @@
>  */
>  #define HA_KEY_SCAN_NOT_ROR     128 
>  #define HA_DO_INDEX_COND_PUSHDOWN  256 /* Supports Index Condition Pushdown */
> -
> -
> +/*
> +  Data is clustered on this key. This means that when you read the key
> +  you also get the row data in the same block.

perhaps, better to say that "you get the row data without an extra seek
(or even without an extra disk read)".

A question: what about memory-based (and SSD) engines? Could/Should one
consider all indexes clustered?

> +*/
> +#define HA_CLUSTERED_INDEX      512
>  
>  /*
>    bits in alter_table_flags:
> @@ -2311,9 +2314,22 @@ class handler :public Sql_alloc
>  
>  
>   /*
> -   @retval TRUE   Primary key (if there is one) is clustered
> -                  key covering all fields
> -   @retval FALSE  otherwise
> +   Check if the primary key (if there is one) is a clustered key covering
> +   all fields. This means:

I wouldn't say that it "checks if the primary key is clustered", but
something like "historical method that returns TRUE if everything below
holds:"

> +
> +   - Data is stored together with the primary key (no secondary lookup
> +     needed to find the row data). The optimizer uses this to find out
> +     the cost of fetching data.
> +   - The primary key is part of each secondary key and is used
> +     to find the row data in the primary index when reading trough
> +     secondary indexes.
> +   - When doing a HA_KEYREAD_ONLY we get also all the primary key parts
> +     into the row. This is critical property used by index_merge.

and here "all the above is usually true for engines that store the row
data in the primary key index (e.g. in a b-tree), and use the primary
key value as a position()"

> +
> +   For a clustered primary key, index_flags() returns also HA_CLUSTERED_INDEX
> +
> +   @retval TRUE   yes
> +   @retval FALSE  No.
>   */
>   virtual bool primary_key_is_clustered() { return FALSE; }
>   virtual int cmp_ref(const uchar *ref1, const uchar *ref2)
> 
> === modified file 'sql/sql_select.cc'
> --- a/sql/sql_select.cc	2011-05-16 12:07:04 +0000
> +++ b/sql/sql_select.cc	2011-05-18 16:26:30 +0000
> @@ -16525,9 +16527,9 @@ test_if_skip_sort_order(JOIN_TAB *tab,OR
>          */
>          DBUG_ASSERT (ref_key != (int) nr);
>  
> -        bool is_covering= table->covering_keys.is_set(nr) ||
> -                          (nr == table->s->primary_key &&
> -                          table->file->primary_key_is_clustered());
> +        bool is_covering= (table->covering_keys.is_set(nr) ||
> +                           (table->file->index_flags(nr, 0, 1) &
> +                            HA_CLUSTERED_INDEX));

See, that's what I said. It make sense to set covering_keys bitmap
for all clustered indexes once, when the table is opened, and then you
not only get this condition simpler, but probably will also cover other
cases where table->covering_keys is checked (or may be checked in the
future).

>          /* 
>            Don't use an index scan with ORDER BY without limit.
> === modified file 'sql/sql_table.cc'
> --- a/sql/sql_table.cc	2011-05-16 11:05:45 +0000
> +++ b/sql/sql_table.cc	2011-05-18 16:26:30 +0000
> @@ -7983,7 +7983,8 @@ copy_data_between_tables(TABLE *from,TAB
>  
>    if (order)
>    {
> -    if (to->s->primary_key != MAX_KEY && to->file->primary_key_is_clustered())
> +    if (to->s->primary_key != MAX_KEY &&
> +        to->file->ha_table_flags() & HA_TABLE_SCAN_ON_INDEX)

Hmm, good.

It took me a while to come up with a realistic use case where
to->file->primary_key_is_clustered() is false, but
to->file->ha_table_flags() & HA_TABLE_SCAN_ON_INDEX is true.

Example: new InnoDB optimization that makes it to use internal generated
primary key when user creates a PK of, say, VARCHAR(255). That is, in
this optimization InnoDB treat this key as UNIQUE but not PRIMARY.
Not completely unrealistic idea. In this case

  to->s->primary_key != MAX_KEY            is true, there is a primary key.
  to->file->primary_key_is_clustered()                  is false, but
  to->file->ha_table_flags() & HA_TABLE_SCAN_ON_INDEX   is true.

Heh, this is a good example, perhaps we need to apply it as a test to
other code that uses primary_key_is_clustered() and see if the condition
is used correctly.

>      {
>        char warn_buff[MYSQL_ERRMSG_SIZE];
>        my_snprintf(warn_buff, sizeof(warn_buff), 
> 
Regards,
Sergei

Follow ups

Re: [Commits] Rev 2994: Original idea from Zardosht Kasheff to add HA_CLUSTERED_INDEX in lp:maria/5.3
From: Michael Widenius, 2011-05-19