← Back to team overview

maria-developers team mailing list archive

Re: problem with partitioning and our storage engine in 5.2

 

Our storage engine does not lie to the optimizer. We do not set the
HA_STATS_RECORDS_IS_EXACT flag. I do not know this code well enough to
understand how the relevant code in opt_sum.cc results in the
partition code I describe get executed.

What I do not understand is this. Why does the following code at line
4734 of ha_partition.cc (at the bottom of this message) simply assume
that if file->stats.records is 0, that it can blindly return
HA_ERR_END_OF_FILE? I understand if file->ha_table_flags exposes
HA_STATS_RECORDS_IS_EXACT, that it can do this, but that is not
checked here.

-Zardosht

    case partition_index_last:
      /*
        MyISAM engine can fail if we call index_last() when indexes disabled
        that happens if the table is empty.
        Here we use file->stats.records instead of file->records() because
        file->records() is supposed to return an EXACT count, and it can be
        possibly slow. We don't need an exact number, an approximate one- from
        the last ::info() call - is sufficient.
      */
      if (file->stats.records == 0)
      {
        error= HA_ERR_END_OF_FILE;
        break;
      }
      error= file->ha_index_last(rec_buf_ptr);
      reverse_order= TRUE;
      break;


-Zardosht

On Wed, Dec 8, 2010 at 5:34 PM, Timour Katchaounov <timour@xxxxxxxxxxxx> wrote:
> Zardosht,
>
>> I think we found out the problem. In ha_partition.cc, in the function
>> ha_partition::handle_ordered_index_scan, this code was added:
>>     case partition_index_last:
>>       /*
>>         MyISAM engine can fail if we call index_last() when indexes
>> disabled
>>         that happens if the table is empty.
>>         Here we use file->stats.records instead of file->records() because
>>         file->records() is supposed to return an EXACT count, and it can
>> be
>>         possibly slow. We don't need an exact number, an approximate one-
>> from
>>         the last ::info() call - is sufficient.
>>       */
>>       if (file->stats.records == 0)
>>       {
>>         error= HA_ERR_END_OF_FILE;
>>         break;
>>
>> We always thought that stats.records was meant to be an estimate, and
>> that an estimate of 0 was ok even if the table was non-empty. We were
>> reporting that stats.records was 0 even though the table was
>> non-empty. Is this assumption wrong?
>
> The assumption is not generally wrong, however there are two kinds of
> storage engines in this respect. The first kind promises to provide
> exact count of rows, MyISAM is an example. The second kind doesn't
> promise exact counts, for instance InnoDB. The optimizer checks this,
> and if it can rely on the exact count, then it optimizes count(*)
> whenever possible (for count(*) queries without GROUP BY).
>
> The relevant code is in opt_sum.cc. The function opt_sum_query
> performs this test:
> ...
>    if (!(tl->table->file->ha_table_flags() & HA_STATS_RECORDS_IS_EXACT) ||
>        tl->schema_table)
> ...
>
> The question is what is your engine's implementation of
> your_handler_class::ha_table_flags()? If you "lie" to the optimizer
> that the count is exact, but then treat is an approximate value,
> you'll get the above wrong result for sure.
>
> If this is not the problem, then please post a reproducible example.
>
> Timour
>
>>
>> -Zardosht
>>
>> On Wed, Dec 8, 2010 at 2:42 PM, Kristian Nielsen
>> <knielsen@xxxxxxxxxxxxxxx>  wrote:
>>>
>>> Zardosht Kasheff<zardosht@xxxxxxxxx>  writes:
>>>
>>>> We have been working on testing our storage engine, TokuDB, against
>>>> MariaDB 5.2.3, and we have encountered a problem with partitioning
>>>> that does not exist on MySQL 5.1, MySQL 5.5, and MariaDB 5.1.50. This
>>>> problem also does not exist with any other storage engine that we have
>>>> tried. It ONLY exists with TokuDB and MariaDB 5.2.
>>>
>>>> Here is what the test is doing:
>>>>
>>>>    1. Create a table with some partitions
>>>>    2. update the table
>>>>    3. run select max(f_int1)
>>>>    4. run select *
>>>>    5. run select max(f_int1) again.
>>>>
>>>> The problem is that the query results for 3 are incorrect, even though
>>>> the query results for 4 and 5 are correct. MySQL 5.1, MySQL 5.5, and
>>>> MariaDB 5.1 produce the correct results for 3.
>>>>
>>>> What makes this even stranger is that the query results for 4 and 5
>>>> are always correct.
>>>
>>> It's hard to say without any code. Can you post link to launchpad tree or
>>> source tarball with which to reproduce?
>>>
>>>  From the information given, I would suggest to run the test with
>>> --valgrind,
>>> in case the problem is some uninitialised memory being referenced;
>>> sometimes
>>> Valgrind can help catch this.
>>>
>>>  - Kristian.
>>>
>>
>> _______________________________________________
>> Mailing list: https://launchpad.net/~maria-developers
>> Post to     : maria-developers@xxxxxxxxxxxxxxxxxxx
>> Unsubscribe : https://launchpad.net/~maria-developers
>> More help   : https://help.launchpad.net/ListHelp
>



Follow ups

References