maria-developers team mailing list archive
-
maria-developers team
-
Mailing list archive
-
Message #03774
Re: problem with partitioning and our storage engine in 5.2
Our storage engine does not lie to the optimizer. We do not set the
HA_STATS_RECORDS_IS_EXACT flag. I do not know this code well enough to
understand how the relevant code in opt_sum.cc results in the
partition code I describe get executed.
What I do not understand is this. Why does the following code at line
4734 of ha_partition.cc (at the bottom of this message) simply assume
that if file->stats.records is 0, that it can blindly return
HA_ERR_END_OF_FILE? I understand if file->ha_table_flags exposes
HA_STATS_RECORDS_IS_EXACT, that it can do this, but that is not
checked here.
-Zardosht
case partition_index_last:
/*
MyISAM engine can fail if we call index_last() when indexes disabled
that happens if the table is empty.
Here we use file->stats.records instead of file->records() because
file->records() is supposed to return an EXACT count, and it can be
possibly slow. We don't need an exact number, an approximate one- from
the last ::info() call - is sufficient.
*/
if (file->stats.records == 0)
{
error= HA_ERR_END_OF_FILE;
break;
}
error= file->ha_index_last(rec_buf_ptr);
reverse_order= TRUE;
break;
-Zardosht
On Wed, Dec 8, 2010 at 5:34 PM, Timour Katchaounov <timour@xxxxxxxxxxxx> wrote:
> Zardosht,
>
>> I think we found out the problem. In ha_partition.cc, in the function
>> ha_partition::handle_ordered_index_scan, this code was added:
>> case partition_index_last:
>> /*
>> MyISAM engine can fail if we call index_last() when indexes
>> disabled
>> that happens if the table is empty.
>> Here we use file->stats.records instead of file->records() because
>> file->records() is supposed to return an EXACT count, and it can
>> be
>> possibly slow. We don't need an exact number, an approximate one-
>> from
>> the last ::info() call - is sufficient.
>> */
>> if (file->stats.records == 0)
>> {
>> error= HA_ERR_END_OF_FILE;
>> break;
>>
>> We always thought that stats.records was meant to be an estimate, and
>> that an estimate of 0 was ok even if the table was non-empty. We were
>> reporting that stats.records was 0 even though the table was
>> non-empty. Is this assumption wrong?
>
> The assumption is not generally wrong, however there are two kinds of
> storage engines in this respect. The first kind promises to provide
> exact count of rows, MyISAM is an example. The second kind doesn't
> promise exact counts, for instance InnoDB. The optimizer checks this,
> and if it can rely on the exact count, then it optimizes count(*)
> whenever possible (for count(*) queries without GROUP BY).
>
> The relevant code is in opt_sum.cc. The function opt_sum_query
> performs this test:
> ...
> if (!(tl->table->file->ha_table_flags() & HA_STATS_RECORDS_IS_EXACT) ||
> tl->schema_table)
> ...
>
> The question is what is your engine's implementation of
> your_handler_class::ha_table_flags()? If you "lie" to the optimizer
> that the count is exact, but then treat is an approximate value,
> you'll get the above wrong result for sure.
>
> If this is not the problem, then please post a reproducible example.
>
> Timour
>
>>
>> -Zardosht
>>
>> On Wed, Dec 8, 2010 at 2:42 PM, Kristian Nielsen
>> <knielsen@xxxxxxxxxxxxxxx> wrote:
>>>
>>> Zardosht Kasheff<zardosht@xxxxxxxxx> writes:
>>>
>>>> We have been working on testing our storage engine, TokuDB, against
>>>> MariaDB 5.2.3, and we have encountered a problem with partitioning
>>>> that does not exist on MySQL 5.1, MySQL 5.5, and MariaDB 5.1.50. This
>>>> problem also does not exist with any other storage engine that we have
>>>> tried. It ONLY exists with TokuDB and MariaDB 5.2.
>>>
>>>> Here is what the test is doing:
>>>>
>>>> 1. Create a table with some partitions
>>>> 2. update the table
>>>> 3. run select max(f_int1)
>>>> 4. run select *
>>>> 5. run select max(f_int1) again.
>>>>
>>>> The problem is that the query results for 3 are incorrect, even though
>>>> the query results for 4 and 5 are correct. MySQL 5.1, MySQL 5.5, and
>>>> MariaDB 5.1 produce the correct results for 3.
>>>>
>>>> What makes this even stranger is that the query results for 4 and 5
>>>> are always correct.
>>>
>>> It's hard to say without any code. Can you post link to launchpad tree or
>>> source tarball with which to reproduce?
>>>
>>> From the information given, I would suggest to run the test with
>>> --valgrind,
>>> in case the problem is some uninitialised memory being referenced;
>>> sometimes
>>> Valgrind can help catch this.
>>>
>>> - Kristian.
>>>
>>
>> _______________________________________________
>> Mailing list: https://launchpad.net/~maria-developers
>> Post to : maria-developers@xxxxxxxxxxxxxxxxxxx
>> Unsubscribe : https://launchpad.net/~maria-developers
>> More help : https://help.launchpad.net/ListHelp
>
Follow ups
References