maria-developers team mailing list archive

Re: Analysing MariaDB 5.5 sysbench performance regression

 

MARK CALLAGHAN <mdcallag@xxxxxxxxx> writes:

>> Another is my_hash_sort_simple(). This is not a regression, but Oprofile shows
>> we spend 10% of total time in this function.
>
> Does this do utf aware compares or memcmp?

This one uses collations - so not memcmp. It is used for SELECT DISTINCT and
GROUP BY. This particular test was not utf8 though, it was using an 8-bit
charset (probably latin1).
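
To illustrate the difference between hashing through a collation and hashing
raw bytes, here is a minimal sketch. It is not the actual my_hash_sort_simple()
code: the case-insensitive table, the function names, and the FNV-1a style
mixing step are all assumptions made for the example. The point is only that
every key byte goes through a 256-entry sort-order lookup before it is mixed
in, so keys that compare equal under the collation get the same hash.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sort-order table for an 8-bit charset: identity mapping,
 * except that ASCII lower case folds to upper case (a toy stand-in for a
 * case-insensitive collation). */
static void build_ci_sort_order(unsigned char table[256])
{
    for (int i = 0; i < 256; i++)
        table[i] = (unsigned char)i;
    for (int c = 'a'; c <= 'z'; c++)
        table[c] = (unsigned char)(c - 'a' + 'A');
}

/* Collation-aware hash: each byte is translated through sort_order before
 * mixing. The mixing is FNV-1a, chosen for brevity, not what the server uses. */
static uint32_t hash_with_sort_order(const unsigned char *sort_order,
                                     const char *key, size_t len)
{
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < len; i++) {
        h ^= sort_order[(unsigned char)key[i]];
        h *= 16777619u;
    }
    return h;
}

/* Byte-wise hash with no collation: the memcmp-like counterpart. */
static uint32_t hash_raw_bytes(const char *key, size_t len)
{
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < len; i++) {
        h ^= (unsigned char)key[i];
        h *= 16777619u;
    }
    return h;
}
```

With this table, "Distinct" and "distinct" hash to the same value through
hash_with_sort_order() but to different values through hash_raw_bytes(). That
is the behaviour GROUP BY and SELECT DISTINCT need under a case-insensitive
collation, and the extra table lookup per byte is part of what it costs.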

The utf-8 version of this looks significantly more expensive.

Monty will fix the heap tables to remove this overhead.

> For the old-style checksum that you describe above "gcc -O3" was much
> faster than "gcc -O2" using an older version (maybe 4.2) of gcc. If
> you are willing to require a dump/reload and break InnoDB binary
> compatibility (I don't recommend this) then some results are at:
>  http://mysqlha.blogspot.com/2009/05/innodb-checksum-performance.html

I checked the assembler and the compilation was optimal; it is just an expensive
algorithm, costing a minimum of 7 cycles for each of the 16K bytes in the page.
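
As a back-of-the-envelope sketch of what that means per page: the 7 cycles
per byte comes from the measurement above and 16K is taken as 16384 bytes,
while the helper names and the 2.5 GHz clock in the usage note are
assumptions picked only to turn cycles into time.

```c
#include <stdint.h>

#define PAGE_BYTES          16384u  /* 16K page */
#define MIN_CYCLES_PER_BYTE 7u      /* lower bound quoted above */

/* Lower bound on CPU cycles needed to checksum one page. */
static uint64_t checksum_cycles_per_page(uint64_t page_bytes,
                                         uint64_t cycles_per_byte)
{
    return page_bytes * cycles_per_byte;
}

/* Convert cycles to microseconds; cycles divided by MHz gives microseconds.
 * The clock rate is an input, not something measured in this thread. */
static double cycles_to_microseconds(uint64_t cycles, double clock_mhz)
{
    return (double)cycles / clock_mhz;
}
```

checksum_cycles_per_page(PAGE_BYTES, MIN_CYCLES_PER_BYTE) gives 114688 cycles,
which would be roughly 46 microseconds per page on an assumed 2.5 GHz core,
before counting any cache misses.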

33.6% is a really high overhead! I suppose this comes from heavy buffer pool
flushing, causing new checksum calculation for every few rows inserted. And
maybe from cache misses going over each buffer pool page.

> The Facebook patch added support for crc32 checksums that could use
> crc32 instructions on Intel HW. Official MySQL then added something
> like that to one of their new releases.

Ok, that's good - best if data format changes are coordinated by InnoDB
upstream.
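
For reference, a portable sketch of the checksum the Intel crc32 instruction
computes, which is CRC-32C (the Castagnoli polynomial). I am assuming this is
the variant the patch uses; I have not checked the patch itself. This is a
bit-at-a-time software version meant only to show the algorithm. It is not
the Facebook or MySQL code, and the function name is made up.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32C, reflected polynomial 0x82F63B78.
 * Slow but portable; hardware handles 1 to 8 bytes per instruction. */
static uint32_t crc32c_sw(const unsigned char *buf, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;              /* standard initial value */
    for (size_t i = 0; i < len; i++) {
        crc ^= buf[i];
        for (int bit = 0; bit < 8; bit++) {
            /* mask is all ones if the low bit is set, else zero */
            uint32_t mask = 0u - (crc & 1u);
            crc = (crc >> 1) ^ (0x82F63B78u & mask);
        }
    }
    return crc ^ 0xFFFFFFFFu;                /* standard final inversion */
}
```

The standard check input "123456789" yields 0xE3069283. On SSE4.2 hardware
the inner bit loop is replaced by the crc32 instruction, which compilers
expose as the _mm_crc32_u8 and _mm_crc32_u64 intrinsics in <nmmintrin.h>.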

Thanks,

 - Kristian.

