← Back to team overview

maria-developers team mailing list archive

Re: Optimization about count(*) of mysql 5.1.38

 

Hi, Lichao!

On Jan 20, Lichao Xie wrote:
> >> hi all:
> >> I have a question about the optimization of count(*) in a range.
> >> Sql for mysql like:select count(*) from t where id > 10 and id < 1000000;
> >> In table t, there is an B-tree index on id field, this sql will read
> >> hundreds of thousands of records from the storage engine, that is a
> >> time-consuming operation.
> >> I am developing a storage engine, I want to know, is there a good
> >> method to caculate the records in a range, rather than read all
> >> records or keys in the range?
> >
> > Yes, see how MyISAM does it - it's pretty straightforward, and only
> > requires two index lookups.
> 
> MyISAM seemingly read all keys in the range. And I test MyISAM table
> use mysqlslap, the test case is:

Right. For the COUNT(*) query it does - MySQL does, there is no way
around it.

> the case runs too slow....
> If there is a storage engine api like records_in_range() to get the
> exact records in the range from the storage engine, the result will be
> 10^4 qps for this case...

No, there is no such a API. The best you can do in your engine is to pay
attention to TABLE::read_set and HA_EXTRA_KEYREAD which will allow you
to reduce the amount of work you do per key. But still it will be the
complete range scan - from the lower to the upper boundary or the range.

Regards,
Sergei



Follow ups

References