enterprise-support team mailing list archive
Message #04041
[Bug 1413836] [NEW] Semi-sync replication performance degrades with high number of threads
Public bug reported:
* Taken verbatim from mysql bug report filed by Rene Cannao
Description:
Based on our experience (issues in production that are easily reproducible
in a testing environment), when semi-sync replication is enabled mysqld
handles around 6k TRX/s while the number of threads running is
relatively low (100-200 connections), but throughput degrades once the
number of threads running goes beyond a certain threshold. With 3000
threads running, throughput is no more than 300 TRX/s.
While at a low number of threads running the bottleneck seems to be
network RTT, at a high number of threads network activity drops, and we
also notice a constantly high number of context switches.
Combining the output of pt-pmp (I think the relevant part is what is
listed below) with the semi-sync source code, we believe there is high
contention on LOCK_binlog_, which is taken in order to compare the binlog
coordinates of the ACK from the slave(s) with the binlog coordinates of
each thread that issued a commit, along with what seems an inefficient
way to wake up threads.
2459 pthread_cond_timedwait,ReplSemiSyncMaster::cond_timewait(semisync_master.so),ReplSemiSyncMaster::commitTrx(semisync_master.so),Trans_delegate::after_commit,MYSQL_BIN_LOG::process_after_commit_stage_queue(binlog.cc:6833),MYSQL_BIN_LOG::ordered_commit(binlog.cc:7217),MYSQL_BIN_LOG::commit(binlog.cc:6609),ha_commit_trans(handler.cc:1500),trans_commit,mysql_execute_command(sql_parse.cc:4711),mysql_parse(sql_parse.cc:6744),dispatch_command(sql_parse.cc:1432),do_handle_one_connection,handle_one_connection,pfs_spawn_thread,start_thread(libpthread.so.0),clone(libc.so.6),??
541 pthread_cond_wait,Stage_manager::enroll_for(mysql_thread.h:1162),MYSQL_BIN_LOG::ordered_commit(binlog.cc:6897),MYSQL_BIN_LOG::commit(binlog.cc:6609),ha_commit_trans(handler.cc:1500),trans_commit,mysql_execute_command(sql_parse.cc:4711),mysql_parse(sql_parse.cc:6744),dispatch_command(sql_parse.cc:1432),do_handle_one_connection,handle_one_connection,pfs_spawn_thread,start_thread(libpthread.so.0),clone(libc.so.6),??
...
1 __lll_lock_wait(libpthread.so.0),_L_lock_1006(libpthread.so.0),pthread_mutex_lock(libpthread.so.0),ReplSemiSyncMaster::lock(semisync_master.so),ReplSemiSyncMaster::updateSyncHeader(semisync_master.so),Binlog_transmit_delegate::before_send_event,mysql_binlog_send(rpl_master.cc:1325),com_binlog_dump(rpl_master.cc:767),dispatch_command(sql_parse.cc:1644),do_handle_one_connection,handle_one_connection,pfs_spawn_thread,start_thread(libpthread.so.0),clone(libc.so.6),??
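As a reading aid for the listing above: each pt-pmp line is a thread count followed by a comma-separated stack, innermost frame first. The following is a minimal sketch, not part of the original report, of how such lines could be tallied per function. The helper names `parse_pt_pmp` and `threads_in` are hypothetical, and the sample stacks are shortened copies of the lines quoted above (the elided `...` line stays elided).

```python
# Shortened copies of the stacks quoted in the report; outer frames are trimmed.
SAMPLE = """\
2459 pthread_cond_timedwait,ReplSemiSyncMaster::cond_timewait(semisync_master.so),ReplSemiSyncMaster::commitTrx(semisync_master.so),Trans_delegate::after_commit,MYSQL_BIN_LOG::process_after_commit_stage_queue(binlog.cc:6833),MYSQL_BIN_LOG::ordered_commit(binlog.cc:7217)
541 pthread_cond_wait,Stage_manager::enroll_for(mysql_thread.h:1162),MYSQL_BIN_LOG::ordered_commit(binlog.cc:6897)
...
1 __lll_lock_wait(libpthread.so.0),_L_lock_1006(libpthread.so.0),pthread_mutex_lock(libpthread.so.0),ReplSemiSyncMaster::lock(semisync_master.so),ReplSemiSyncMaster::updateSyncHeader(semisync_master.so)
"""


def parse_pt_pmp(text):
    """Parse 'COUNT frame1,frame2,...' lines into (count, [frames]) tuples."""
    stacks = []
    for line in text.splitlines():
        parts = line.strip().split(None, 1)
        if len(parts) != 2 or not parts[0].isdigit():
            continue  # skip blank lines and the elided "..." line
        stacks.append((int(parts[0]), parts[1].split(",")))
    return stacks


def threads_in(stacks, function_name):
    """Sum the thread counts of every stack that contains function_name."""
    total = 0
    for count, frames in stacks:
        # Drop the "(file:line)" or "(library)" suffix before comparing.
        if any(frame.split("(")[0] == function_name for frame in frames):
            total += count
    return total


if __name__ == "__main__":
    stacks = parse_pt_pmp(SAMPLE)
    for name in ("ReplSemiSyncMaster::commitTrx", "Stage_manager::enroll_for"):
        print(name, threads_in(stacks, name))
```

Because the listing is partial, the totals only cover the stacks shown, not the whole sample file.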
Each transaction thread waiting for an ACK does the following (simplified):
- holds a lock;
- in a loop ( while (is_on()) ):
  - compares the ACK position with its own position;
  - if the ACK is still behind, waits on a condition variable.
In reportReplyBinlog, if there is at least one thread waiting for an
ACK, it sends a broadcast to all the threads.
This also creates contention on LOCK_log in
MYSQL_BIN_LOG::ordered_commit.
What seems to be a design flaw is that all the threads wake up. In a
scenario with few threads running it is likely that they will all return
to the application, but with a lot of threads running perhaps only a few
will return to the application and most of them will go back to wait on
the same condition variable. This causes a lot of context switches, and
the CPU gets so busy performing them that replication cannot make much
progress.
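The wake-up pattern described above can be illustrated with a toy model. This is a sketch under stated assumptions, not the semi-sync plugin code: the class `SemiSyncModel`, its methods, and the `simulate` driver are hypothetical names, Python's `threading.Condition` stands in for the mutex and condition variable, and ACKs are assumed to arrive one binlog position at a time (a pessimistic case).

```python
import threading
import time


class SemiSyncModel:
    """Toy model (not MySQL code) of the commit-side wait and ACK broadcast."""

    def __init__(self):
        self.cond = threading.Condition()  # one lock plus one condition variable
        self.acked_pos = 0        # highest position acknowledged so far
        self.waiting = 0          # threads currently blocked in wait()
        self.wakeups = 0          # total wake-ups caused by broadcasts
        self.futile_wakeups = 0   # wake-ups that found the ACK still behind

    def commit_wait(self, my_pos):
        """Block until the acknowledged position reaches my_pos."""
        with self.cond:                       # hold the lock
            while self.acked_pos < my_pos:    # ACK still behind: wait
                self.waiting += 1
                self.cond.wait()
                self.waiting -= 1
                self.wakeups += 1
                if self.acked_pos < my_pos:
                    self.futile_wakeups += 1  # woken, but must wait again

    def report_ack(self, pos):
        """Record an ACK and, if anyone is waiting, wake every waiter."""
        with self.cond:
            self.acked_pos = max(self.acked_pos, pos)
            if self.waiting > 0:
                self.cond.notify_all()


def simulate(num_threads):
    """Thread i waits for position i; ACKs then arrive for 1, 2, 3, ..."""
    model = SemiSyncModel()
    threads = [threading.Thread(target=model.commit_wait, args=(pos,))
               for pos in range(1, num_threads + 1)]
    for t in threads:
        t.start()

    def settle(expected_waiting, expected_wakeups):
        # Poll until every woken thread has returned or gone back to wait,
        # so each broadcast is counted exactly once per waiter.
        while True:
            with model.cond:
                if (model.waiting == expected_waiting
                        and model.wakeups == expected_wakeups):
                    return
            time.sleep(0.001)

    expected_wakeups = 0
    settle(num_threads, expected_wakeups)
    for pos in range(1, num_threads + 1):
        waiters = num_threads - pos + 1
        model.report_ack(pos)
        expected_wakeups += waiters
        settle(waiters - 1, expected_wakeups)
    for t in threads:
        t.join()
    return model.wakeups, model.futile_wakeups


if __name__ == "__main__":
    print(simulate(20))
```

In this model, N waiting threads receive N(N+1)/2 wake-ups in total, of which N(N-1)/2 are futile, so the wasted work grows quadratically with the number of threads while the useful wake-ups grow only linearly.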
Attached is the output of pt-pmp resulting from the following command:
pt-pmp --iterations=1 --save-samples=/tmp/samples.pmp > /tmp/output.pmp
When this was executed, mysqld was processing traffic generated by sysbench in a write-intensive workload running 3000 threads.
How to repeat:
Set up semi-sync replication.
Run a write-intensive workload against the master with few threads. Ex:
sysbench --max-requests=0 --max-time=7200 --test=oltp --mysql-user=rcannao --mysql-password=rcannao --mysql-db=rcannao --db-driver=mysql --db-ps-mode=disable --oltp-table-size=10000000 --oltp-point-selects=1 --oltp-index-updates=0 --oltp-simple-ranges=0 --oltp-sum-ranges=0 --oltp-order-ranges=0 --oltp-distinct-ranges=0 --num-threads=30 run
Run a write-intensive workload against the master with a lot of threads. Ex:
sysbench --max-requests=0 --max-time=7200 --test=oltp --mysql-user=rcannao --mysql-password=rcannao --mysql-db=rcannao --db-driver=mysql --db-ps-mode=disable --oltp-table-size=10000000 --oltp-point-selects=1 --oltp-index-updates=0 --oltp-simple-ranges=0 --oltp-sum-ranges=0 --oltp-order-ranges=0 --oltp-distinct-ranges=0 --num-threads=3000 run
As a comparison, run the above workloads without semi-sync enabled to
verify that the bottleneck is semi-sync.
Suggested fix:
Spin for a few iterations before suspending the thread with pthread_cond_timedwait().
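The control flow of this suggestion might look roughly like the sketch below. It is an illustration only, not a patch: `AckState`, `wait_for_ack`, and the `spins` and `timeout` parameters are hypothetical, Python's `threading.Condition.wait(timeout)` stands in for pthread_cond_timedwait(), and because of Python's global interpreter lock it shows the logic rather than any performance gain.

```python
import threading
import time


class AckState:
    """Toy stand-in (not MySQL code) for the shared ACK position."""

    def __init__(self):
        self.cond = threading.Condition()
        self.acked_pos = 0

    def report_ack(self, pos):
        with self.cond:
            self.acked_pos = max(self.acked_pos, pos)
            self.cond.notify_all()

    def wait_for_ack(self, my_pos, spins=1000, timeout=1.0):
        """Return "spin", "wait" or "timeout" depending on how the wait ended."""
        # Phase 1: bounded spin. Re-check the position a few times without
        # blocking on the condition variable, yielding between checks.
        for _ in range(spins):
            if self.acked_pos >= my_pos:
                return "spin"
            time.sleep(0)
        # Phase 2: fall back to a timed wait on the condition variable.
        deadline = time.monotonic() + timeout
        with self.cond:
            while self.acked_pos < my_pos:
                remaining = deadline - time.monotonic()
                if remaining <= 0:
                    return "timeout"
                self.cond.wait(remaining)
            return "wait"


if __name__ == "__main__":
    state = AckState()
    state.report_ack(5)
    print(state.wait_for_ack(5))  # the ACK is already there, so no blocking wait
```

A thread whose ACK arrives during the spin phase never sleeps on the condition variable and so is not part of the next broadcast. The cost is extra CPU spent spinning, so the spin count would need tuning against the observed ACK latency.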
** Affects: mysql-server
Importance: Unknown
Status: Unknown
** Affects: percona-server
Importance: Undecided
Status: New
** Tags: i50047
** Bug watch added: MySQL Bug System #75570
http://bugs.mysql.com/bug.php?id=75570
** Also affects: mysql-server via
http://bugs.mysql.com/bug.php?id=75570
Importance: Unknown
Status: Unknown
--
You received this bug notification because you are a member of Ubuntu
Server/Client Support Team, which is subscribed to MySQL.
Matching subscriptions: Ubuntu Server/Client Support Team
https://bugs.launchpad.net/bugs/1413836
Title:
Semi-sync replication performance degrades with high number of threads
To manage notifications about this bug go to:
https://bugs.launchpad.net/mysql-server/+bug/1413836/+subscriptions