maria-developers team mailing list archive
Message #06942
Parallel replication benchmarks
I've done a set of benchmarks for parallel replication on the same machine I
used previously for my group commit benchmarks,
http://kristiannielsen.livejournal.com/16382.html
The code tested is the newest code in the bzr repository and what will be in
10.0.9 (this is significantly improved from what is in 10.0.8).
I plan to write up a blog post about it in a couple of days with nice graphs,
but meanwhile Axel asked me to summarise in this mail.
I tested with sysbench 0.5, using oltp.lua (medium-sized transactions) and
update_index.lua (minimal transactions with just a single primary-key update
per transaction). I used 10M rows, 16GB buffer pool and 2 * 1.9 GB redo logs.
This is with a single table.
I tested simply by preparing the binlog on the master, then setting up an
already prepared slave and doing START SLAVE UNTIL the end of the log. The
error log then shows the time spent for the slave to catch up. I tested
everything in GTID mode, as that is the recommended mode for parallel
replication (though my guess is that old-style replication will be much the
same, as there is little difference in the code that actually executes
events).
Note that these tests are for in-order parallel replication. All commits on
all slaves happen in the same order as on the master; the use of parallel
replication is invisible to applications. This is in contrast to e.g. the
MySQL 5.6 multi-threaded slave, which requires the application to partition
its data into independent schemas.
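To make the in-order idea concrete, here is a minimal toy sketch in Python. It is not the server's implementation; the function name apply_in_order and the callable "transactions" are made up for illustration. Worker threads execute transactions concurrently, but each one waits for all earlier transactions to commit before committing itself, so the commit order matches the log order.

```python
import threading
import time


def apply_in_order(transactions, n_workers=4):
    """Toy model: execute transactions in parallel, commit them in log order.

    `transactions` is a list of zero-argument callables (hypothetical
    stand-ins for replicated transactions). Returns the commit log as a
    list of (sequence_number, result) tuples.
    """
    commit_log = []
    state = {"next_fetch": 0, "next_commit": 0}
    fetch_lock = threading.Lock()
    commit_cond = threading.Condition()

    def worker():
        while True:
            # Hand out transactions in log order.
            with fetch_lock:
                seq = state["next_fetch"]
                if seq >= len(transactions):
                    return
                state["next_fetch"] += 1
            # The execute step may overlap with other workers.
            result = transactions[seq]()
            # The commit step waits until all earlier transactions committed.
            with commit_cond:
                commit_cond.wait_for(lambda: state["next_commit"] == seq)
                commit_log.append((seq, result))
                state["next_commit"] += 1
                commit_cond.notify_all()

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return commit_log


# Usage: transactions with uneven run times still commit in log order.
demo = [(lambda i=i: (time.sleep(0.001 * ((i * 7) % 5)), i * i)[1])
        for i in range(20)]
log = apply_in_order(demo)
print([seq for seq, _ in log])
```

An application reading the commit log sees exactly the order a single-threaded apply would have produced, which is the sense in which the parallelism is invisible.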
Here are the preliminary results, in number of seconds for the slave to catch
up (lower is better) versus number of threads (--slave-parallel-threads, 0
means not using parallel replication):
For oltp.lua, 48 threads were used to generate the load on the master, with
--binlog-commit-wait-count=12 --binlog-commit-wait-usec=10000 to allow the
server to delay a commit by up to 10 milliseconds in order to get larger group
commits and thus more opportunities for parallel apply on the slave:
A: --log-slave-updates --sync-binlog=1 --innodb-flush-log-at-trx-commit=1
B: --skip-log-slave-updates --innodb-flush-log-at-trx-commit=1
C: --skip-log-bin --innodb-flush-log-at-trx-commit=2
D: --log-bin=master-bin --sync-binlog=0 --innodb-flush-log-at-trx-commit=0
#thr     A     B     C     D
   0  1065   869   193   202
   2   361   432   147   161
   4   221   264   118   121
   8   135   177   103   107
  12   114   153   104   105
  16   109   140   104   107
  24   111   139   107   105
  32   111   136    99   109
  48   111   126   108   109
  64   111   121    99   111
We see here a 2-10 times speedup from parallel replication. The master has
around 12 transactions in every group commit, which provides good
opportunities for parallelism on the slave.
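The speedup range can be checked directly from the table. The short Python sketch below copies the oltp.lua numbers from above and divides the non-parallel time (0 threads) by the best time seen in each column; the names oltp and best_speedups are just for illustration.

```python
# Catch-up times in seconds for oltp.lua, copied from the table above.
# Key: --slave-parallel-threads; value: times for configurations A, B, C, D.
oltp = {
    0:  (1065, 869, 193, 202),
    2:  (361, 432, 147, 161),
    4:  (221, 264, 118, 121),
    8:  (135, 177, 103, 107),
    12: (114, 153, 104, 105),
    16: (109, 140, 104, 107),
    24: (111, 139, 107, 105),
    32: (111, 136, 99, 109),
    48: (111, 126, 108, 109),
    64: (111, 121, 99, 111),
}


def best_speedups(table):
    """Return, per column, baseline time divided by the best parallel time."""
    baseline = table[0]
    parallel_rows = [row for thr, row in table.items() if thr != 0]
    columns = zip(*parallel_rows)
    return tuple(round(b / min(col), 2) for b, col in zip(baseline, columns))


for label, speedup in zip("ABCD", best_speedups(oltp)):
    print(f"{label}: {speedup}x")
```

This gives roughly 9.8x for A and 7.2x for B, and a little under 2x for C and D.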
Note that parallel replication is especially effective when the binlog is
enabled and crash-safe (--sync-binlog=1
--innodb-flush-log-at-trx-commit=1). This is because parallel replication can
run the commit of one transaction in parallel with any other transaction, even
if the two transactions would otherwise conflict. This makes group commit
especially effective. In fact, this manages to more or less completely
eliminate any penalty for enabling crash-safe binlog on the slave, which is
quite nice.
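As a back-of-the-envelope illustration of why group commit matters so much with syncing enabled, consider a toy cost model in which every commit group pays for one sync to disk. This is a deliberately crude sketch, not a model of the server; the function name and all parameters are hypothetical.

```python
import math


def toy_sync_cost(n_txns, sync_seconds, group_size):
    """Toy estimate of total sync time: one sync per commit group.

    All parameters are hypothetical; real servers overlap work in ways
    this ignores.
    """
    n_groups = math.ceil(n_txns / group_size)
    return n_groups * sync_seconds


# With one transaction per group, every commit pays for its own sync.
print(toy_sync_cost(n_txns=1200, sync_seconds=1.0, group_size=1))
# With 12 transactions per group, the same syncs are shared 12 ways.
print(toy_sync_cost(n_txns=1200, sync_seconds=1.0, group_size=12))
```

In this model the sync cost falls in proportion to the group size, which is the effect that lets the crash-safe configuration catch up with the others once commits can be grouped on the slave.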
Note also that disabling the binlog actually tends to make things _slower_,
not faster, when using parallel replication. I believe this is due to
MDEV-5802, which may be worth fixing for 10.0.
Here are results for update_simple.lua with 48 threads on the master. This
produced around 13 transactions per group commit on the master, with no
--binlog-commit-wait-count to delay commits:
A: --log-slave-updates --sync-binlog=1 --innodb-flush-log-at-trx-commit=1
B: --skip-log-slave-updates --innodb-flush-log-at-trx-commit=1
C: --skip-log-bin --innodb-flush-log-at-trx-commit=2
#thr     A     B     C
   0   931   899   271
   2   546   653   258
   4   365   494   176
   8   261   365   203
  12   247   350   197
  16   233   336   207
  24   242   316   209
  32   237   292   194
  48   235   270   208
  64   228   249   195
Again we get a good speedup from parallel replication, even though with such
small transactions, there is less opportunity for improvement, as the actual
work for transactions is rather small compared to the overhead for managing
the replication of each event. And again, the ability to utilise group commit
effectively provides the biggest benefit.
Finally, I tried a test of update_index.lua where I ran the load on the master
with a single thread. This creates a binlog with _no_ opportunities for
parallelism from group commits - each transaction needs to be executed on its
own by the slave, as we do not know for sure that they will not conflict on
row locks. However, because the commits can still run in parallel (and
hence get group commit on the slave), we still see some speedup even here with
--sync-binlog=1 and --innodb-flush-log-at-trx-commit=1. When binlog and InnoDB
syncing are disabled, parallel replication makes things slower due to the
overhead of thread communication:
A: --log-slave-updates --sync-binlog=1 --innodb-flush-log-at-trx-commit=1
B: --skip-log-slave-updates --innodb-flush-log-at-trx-commit=1
C: --skip-log-bin --innodb-flush-log-at-trx-commit=2
#thr     A     B     C
   0  1075   949   270
   2   597   673   319
   4   443   623   334
   8   407   588   349
  12   393   544   336
  16   391   536   352
  24     -   492   336
  32   389   472   358
  48   389   419   344
  64   391   399   354
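To see at a glance where parallel replication hurts in this single-threaded-master test, the sketch below copies the table above into Python and lists, per configuration, the thread counts whose catch-up time is worse than the non-parallel baseline. The missing measurement (A at 24 threads) is kept as None and skipped; the names are illustrative only.

```python
# Catch-up times in seconds, copied from the table above (columns A, B, C).
# None marks the missing measurement for A at 24 threads.
single_threaded_master = {
    0:  (1075, 949, 270),
    2:  (597, 673, 319),
    4:  (443, 623, 334),
    8:  (407, 588, 349),
    12: (393, 544, 336),
    16: (391, 536, 352),
    24: (None, 492, 336),
    32: (389, 472, 358),
    48: (389, 419, 344),
    64: (391, 399, 354),
}


def slower_than_baseline(table, labels="ABC"):
    """Map each column label to the thread counts slower than 0 threads."""
    baseline = table[0]
    slower = {}
    for threads, row in table.items():
        if threads == 0:
            continue
        for label, base, seconds in zip(labels, baseline, row):
            if seconds is not None and seconds > base:
                slower.setdefault(label, []).append(threads)
    return slower


print(slower_than_baseline(single_threaded_master))
```

Only configuration C shows up, at every thread count tested, while A and B are faster than the baseline throughout.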
So overall, results look very good, especially for slaves with binlog
enabled. (And the binlog-disabled case could turn out better if MDEV-5802 is
fixed.)
Let me know if there are any questions, and I'll be happy to answer them.
- Kristian.