← Back to team overview

maria-developers team mailing list archive

Re: Optimization about count(*) of mysql 5.1.38


Hi Sergei

2011/1/21 Sergei Golubchik <serg@xxxxxxxxxxxx>:
> Hi, Lichao!
> On Jan 20, Lichao Xie wrote:
>> >> hi all:
>> >> I have a question about the optimization of count(*) in a range.
>> >> Sql for mysql like:select count(*) from t where id > 10 and id < 1000000;
>> >> In table t, there is an B-tree index on id field, this sql will read
>> >> hundreds of thousands of records from the storage engine, that is a
>> >> time-consuming operation.
>> >> I am developing a storage engine, I want to know, is there a good
>> >> method to caculate the records in a range, rather than read all
>> >> records or keys in the range?
>> >
>> > Yes, see how MyISAM does it - it's pretty straightforward, and only
>> > requires two index lookups.
>> MyISAM seemingly read all keys in the range. And I test MyISAM table
>> use mysqlslap, the test case is:
> Right. For the COUNT(*) query it does - MySQL does, there is no way
> around it.
>> the case runs too slow....
>> If there is a storage engine api like records_in_range() to get the
>> exact records in the range from the storage engine, the result will be
>> 10^4 qps for this case...
> No, there is no such a API. The best you can do in your engine is to pay
> attention to TABLE::read_set and HA_EXTRA_KEYREAD which will allow you
> to reduce the amount of work you do per key. But still it will be the
> complete range scan - from the lower to the upper boundary or the range.

Yes. But I want to know that with this use is based on what consideration?
the lock or something else?

> Regards,
> Sergei

Follow ups