
maria-developers team mailing list archive

Re: Next steps in improving single-threaded performance

 

Hi,

Kristian Nielsen wrote:
> I have been analysing CPU bottlenecks in single-threaded sysbench read-only
> load. I found that icache misses is the main bottleneck, and that
> profile-guided compiler optimisation (PGO) with GCC gives a large speedup, 25%
> or more.

Here are some more results.

Benchmark 1 is good old sysbench OLTP. I tested 10.0.7 vs. 10.0.7-pgo. At
low concurrency PGO gives about a 10% win; however, this is completely
reversed at higher concurrency by mutex contention (the test was run with
performance schema disabled, so I cannot say which mutex, probably LOCK_open).

Normally I run with preloaded tcmalloc. However, since 10.0.5(?) MariaDB uses
jemalloc internally. Since it is built together with MariaDB, it could benefit
from PGO. However, the numbers look quite similar for tcmalloc vs. jemalloc.


The other benchmark is purely single-threaded and runs Q1 from DBT3 on
memory-based data. Here I include data for many MariaDB and MySQL versions
for comparison. The plot is a classical box-and-whiskers plot where the box
contains 50% of the data points (25th to 75th percentile) and the whiskers
mark the minimum and maximum.
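
For anyone who wants to reproduce the numbers behind such a plot, here is a
minimal sketch of the five values each box is drawn from. The function name
box_summary and the sample timings are made up for illustration and are not
taken from the attached spreadsheet. The quartile interpolation used here
("inclusive") is an assumption; the plotting tool may use a slightly
different rule.

```python
import statistics


def box_summary(samples):
    """Return (minimum, q1, median, q3, maximum) for a list of run times.

    The box spans q1..q3 (the middle 50% of the data points) and the
    whiskers extend to the minimum and maximum.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    # n=4 yields the three quartile cut points; "inclusive" treats the
    # samples as the whole population (an assumption, see above).
    q1, median, q3 = statistics.quantiles(samples, n=4, method="inclusive")
    return min(samples), q1, median, q3, max(samples)


# Hypothetical query times in seconds, NOT measurements from this thread.
example_runs = [10.0, 10.5, 11.0, 11.5, 12.0]
low, q1, median, q3, high = box_summary(example_runs)
print(f"whiskers: {low}..{high}  box: {q1}..{q3}  median: {median}")
```

Running this once per server version gives one box per version, which makes
it easy to compare builds side by side.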

This time the win is about 5% for MariaDB-10.0.8 and ~0 for MariaDB-5.5.35.
However, those results should be taken with a grain of salt, as those builds
were done with the older gcc-4.6.3. I'll have to re-run with gcc-4.7.2
builds (but on different hardware).


BR, XL

GIF image

Attachment: series50.ods
Description: application/vnd.oasis.opendocument.spreadsheet

PNG image

