maria-developers team mailing list archive

Thread
Date

Re: Optimization about count(*) of mysql 5.1.38

To: Sergei Golubchik <serg@xxxxxxxxxxxx>
From: Lichao Xie(谢立超) <xielichao1984@xxxxxxxxx>
Date: Thu, 20 Jan 2011 23:57:01 +0800
Cc: maria-developers@xxxxxxxxxxxxxxxxxxx
In-reply-to: <20110120111004.GA18745@janus.mylan>

> Hi, Lichao!
>
> On Jan 10, Lichao Xie wrote:
>> hi all:
>> I have a question about the optimization of count(*) in a range.
>> Sql for mysql like:select count(*) from t where id > 10 and id < 1000000;
>> In table t, there is an B-tree index on id field, this sql will read
>> hundreds of thousands of records from the storage engine, that is a
>> time-consuming operation.
>> I am developing a storage engine, I want to know, is there a good
>> method to caculate the records in a range, rather than read all
>> records or keys in the range?
>
> Yes, see how MyISAM does it - it's pretty straightforward, and only
> requires two index lookups.

MyISAM seemingly read all keys in the range. And I test MyISAM table
use mysqlslap, the test case is:
./mysqlslap --query="select count(xn_id) as num from roman1.r_paihang
where  dengji > 4 and dengji < 11" --number-of-queries=10000000
--concurrency=30 -uroot --host=10.134.11.17 --port=10001

the result is :
10.134.11.17:/data/mqq_data$ mysqladmin extended-status -i 1 -r -uroot
-S /tmp/mysql-tdmdb-14.sock | grep Com_select
| Com_select                 | 41115084   |
| Com_select                 | 9          |
| Com_select                 | 14         |
| Com_select                 | 23         |
| Com_select                 | 25         |
| Com_select                 | 36         |
| Com_select                 | 8          |
| Com_select                 | 26         |
| Com_select                 | 34         |
| Com_select                 | 24         |
| Com_select                 | 16         |
| Com_select                 | 10         |
| Com_select                 | 8          |

the table scheme is:
CREATE TABLE `r_paihang` (
 `xn_id` int(11) NOT NULL DEFAULT '0',
 `company` int(8) DEFAULT NULL,
 `dengji` int(6) DEFAULT NULL,
 `position` tinyint(3) DEFAULT '1',
 `tili` int(11) DEFAULT '0',
 `zhili` int(11) DEFAULT '0',
 `meili` int(11) DEFAULT '0',
 `worth` int(11) DEFAULT '0',
 `sex` int(1) DEFAULT '0',
 `online_time` int(11) DEFAULT NULL,
 `sign_time` datetime DEFAULT NULL,
 `last_time` datetime DEFAULT NULL,
 `yq_id` int(12) NOT NULL DEFAULT '0',
 PRIMARY KEY (`xn_id`),
 KEY `company` (`company`),
 KEY `worth` (`worth`),
 KEY `zhili` (`zhili`),
 KEY `position` (`position`),
 KEY `meili` (`meili`),
 KEY `tili` (`tili`),
 KEY `sex` (`sex`),
 KEY `dengji` (`dengji`),
 KEY `sign_time` (`sign_time`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;

the case runs too slow....
If there is a storage engine api like records_in_range() to get the
exact records in the range from the storage engine, the result will be
10^4 qps for this case...

>
> Regards,
> Sergei
>

2011/1/20 Sergei Golubchik <serg@xxxxxxxxxxxx>:
> Hi, Lichao!
>
> Please send questions like this to maria-developers@xxxxxxxxxxxxxxxxxxx.
> In your replies make sure send replies to the list, e.g. use "Reply to All".
> It's a public mailing list dedicated to MySQL and MariaDB internals, source
> code, and related things. I am subscribed, so I'll see you mail there, and you
> may be sure I will, because it won't be accidentally catched by my spam filter,
> or sorted out in some obscure folder. Furthermore other subscribers will see
> your question and could reply if I will be not available (e.g. I could be
> travelling).
> Thank you.

Sarry, my fault!
I am resending the email.
>
> On Jan 20, Lichao Xie wrote:
>> Hi, Sergei
>>
>> 2011/1/20 Sergei Golubchik <serg@xxxxxxxxxxxx>:
>> >
>> > On Jan 10, Lichao Xie wrote:
>> >> hi all:
>> >> I have a question about the optimization of count(*) in a range.
>> >> Sql for mysql like:select count(*) from t where id > 10 and id < 1000000;
>> >> In table t, there is an B-tree index on id field, this sql will read
>> >> hundreds of thousands of records from the storage engine, that is a
>> >> time-consuming operation.
>> >> I am developing a storage engine, I want to know, is there a good
>> >> method to caculate the records in a range, rather than read all
>> >> records or keys in the range?
>> >
>> > Yes, see how MyISAM does it - it's pretty straightforward, and only
>> > requires two index lookups.
>>
> Regards,
> Sergei
>

Follow ups

Re: Optimization about count(*) of mysql 5.1.38
From: Sergei Golubchik, 2011-01-20

References

Optimization about count(*) of mysql 5.1.38
From: 谢立超, 2011-01-10
Re: Optimization about count(*) of mysql 5.1.38
From: Sergei Golubchik, 2011-01-19