
enterprise-support team mailing list archive

[Bug 1413836] [NEW] Semi-sync replication performance degrades with high number of threads

 

Public bug reported:

* Taken verbatim from mysql bug report filed by Rene Cannao

Description:

Based on our experience (issues in production, easily reproducible in a
testing environment), when semi-sync replication is enabled mysqld can
handle around 6k TRX/s while the number of threads running is
relatively low (100-200 connections), but throughput degrades once the
number of threads running goes beyond a certain threshold. With 3000
threads running, throughput is no more than 300 TRX/s.

While at a low number of threads running the bottleneck seems to be
network RTT, at a high number of threads network activity drops, and we
also notice a constantly high number of context switches.

Combining the output of pt-pmp (I think the relevant part is listed
below) with the semi-sync source code, we believe there are two
problems: high contention on LOCK_binlog_, which is held in order to
compare the binlog coordinates of the ACK from the slave(s) with the
binlog coordinates of each thread that issued a commit, and what seems
to be an inefficient way of waking up threads.

   2459 pthread_cond_timedwait,ReplSemiSyncMaster::cond_timewait(semisync_master.so),ReplSemiSyncMaster::commitTrx(semisync_master.so),Trans_delegate::after_commit,MYSQL_BIN_LOG::process_after_commit_stage_queue(binlog.cc:6833),MYSQL_BIN_LOG::ordered_commit(binlog.cc:7217),MYSQL_BIN_LOG::commit(binlog.cc:6609),ha_commit_trans(handler.cc:1500),trans_commit,mysql_execute_command(sql_parse.cc:4711),mysql_parse(sql_parse.cc:6744),dispatch_command(sql_parse.cc:1432),do_handle_one_connection,handle_one_connection,pfs_spawn_thread,start_thread(libpthread.so.0),clone(libc.so.6),??
    541 pthread_cond_wait,Stage_manager::enroll_for(mysql_thread.h:1162),MYSQL_BIN_LOG::ordered_commit(binlog.cc:6897),MYSQL_BIN_LOG::commit(binlog.cc:6609),ha_commit_trans(handler.cc:1500),trans_commit,mysql_execute_command(sql_parse.cc:4711),mysql_parse(sql_parse.cc:6744),dispatch_command(sql_parse.cc:1432),do_handle_one_connection,handle_one_connection,pfs_spawn_thread,start_thread(libpthread.so.0),clone(libc.so.6),??
...
      1 __lll_lock_wait(libpthread.so.0),_L_lock_1006(libpthread.so.0),pthread_mutex_lock(libpthread.so.0),ReplSemiSyncMaster::lock(semisync_master.so),ReplSemiSyncMaster::updateSyncHeader(semisync_master.so),Binlog_transmit_delegate::before_send_event,mysql_binlog_send(rpl_master.cc:1325),com_binlog_dump(rpl_master.cc:767),dispatch_command(sql_parse.cc:1644),do_handle_one_connection,handle_one_connection,pfs_spawn_thread,start_thread(libpthread.so.0),clone(libc.so.6),??

Each transaction thread waiting for an ACK does the following (simplified):
- hold a lock;
- in a loop ( while (is_on()) ) :
  - compare the ACK position with its own position;
  - if the ACK is still behind, wait on a condition variable.

In reportReplyBinlog, if at least one thread is waiting for an ACK, a
broadcast is sent to all the waiting threads.

This also creates contention on LOCK_log in
MYSQL_BIN_LOG::ordered_commit.
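
The wait loop and the broadcast described above can be sketched as
follows. This is only an illustration of the pattern, not the MySQL
source: the class AckTracker and its members are hypothetical names,
binlog coordinates are reduced to a single integer, and std::mutex and
std::condition_variable stand in for the pthread primitives seen in the
stack traces.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <cstdint>
#include <mutex>

// Hypothetical stand-in for the semi-sync master state; not MySQL code.
class AckTracker {
public:
    // Called by a committing thread. Returns true if the ACK position
    // reached my_pos before the timeout expired.
    bool wait_for_ack(std::uint64_t my_pos, std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> guard(lock_);      // hold a lock
        const auto deadline = std::chrono::steady_clock::now() + timeout;
        ++waiters_;
        while (enabled_ && acked_pos_ < my_pos) {       // ACK still behind
            // Sleep until a broadcast or the deadline; the loop re-checks
            // the position after every wake-up.
            if (cond_.wait_until(guard, deadline) == std::cv_status::timeout)
                break;
        }
        assert(waiters_ > 0);
        --waiters_;
        return acked_pos_ >= my_pos;
    }

    // Called when an ACK arrives: record it and wake every waiter,
    // whether or not its own position is covered by this ACK.
    void report_ack(std::uint64_t pos) {
        std::lock_guard<std::mutex> guard(lock_);
        if (pos > acked_pos_) acked_pos_ = pos;
        if (waiters_ > 0) cond_.notify_all();
    }

    // Switching the feature off releases all waiters.
    void set_enabled(bool on) {
        std::lock_guard<std::mutex> guard(lock_);
        enabled_ = on;
        cond_.notify_all();
    }

private:
    std::mutex lock_;
    std::condition_variable cond_;
    std::uint64_t acked_pos_ = 0;
    unsigned waiters_ = 0;
    bool enabled_ = true;
};
```

In this sketch every waiter shares one mutex and one condition
variable, so a single report_ack() call makes all sleeping threads
compete for the same lock before most of them go back to sleep.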

What seems to be a design flaw is that all the threads wake up. In a
scenario with few threads running it is likely that they will all
return to the application, but with a lot of threads running perhaps
only a few will return to the application and most of them will go
back to waiting on the same condition variable. This creates a lot of
context switches, and the CPU gets so busy performing them that
replication makes little progress.
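
To get a feel for the scale of the problem, the following is a small
deterministic model, not a measurement. It assumes that every ACK
broadcasts to all threads still waiting and that ACK positions never go
backwards; simulate_broadcast and WakeupStats are hypothetical names
introduced only for this illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct WakeupStats {
    std::uint64_t total_wakeups = 0;   // threads woken by broadcasts
    std::uint64_t futile_wakeups = 0;  // woken, but their position is not yet acknowledged
};

// waiting: the position each blocked thread is waiting for.
// acks:    the sequence of acknowledged positions (assumed non-decreasing).
WakeupStats simulate_broadcast(std::vector<std::uint64_t> waiting,
                               const std::vector<std::uint64_t>& acks) {
    WakeupStats stats;
    std::uint64_t last_ack = 0;
    for (std::uint64_t ack : acks) {
        assert(ack >= last_ack);  // model assumption: ACKs only move forward
        last_ack = ack;
        std::vector<std::uint64_t> still_waiting;
        for (std::uint64_t pos : waiting) {
            ++stats.total_wakeups;            // the broadcast wakes this thread
            if (pos <= ack) continue;         // acknowledged: thread returns
            ++stats.futile_wakeups;           // not acknowledged: back to sleep
            still_waiting.push_back(pos);
        }
        waiting.swap(still_waiting);
    }
    return stats;
}
```

Under the worst-case assumption that each ACK covers exactly one
transaction, N waiting threads produce N(N+1)/2 wake-ups, of which
N(N-1)/2 are futile. For N = 3000 that is 4,501,500 wake-ups to release
3000 threads; with N = 30 it is only 465.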

Attached is the output of pt-pmp resulting from the following command:
pt-pmp --iterations=1 --save-samples=/tmp/samples.pmp > /tmp/output.pmp
When this was executed, mysqld was processing the traffic generated by sysbench in a write-intensive workload running 3000 threads.

How to repeat:
Set up semi-sync replication.

Run a write-intensive workload against the master with few threads. E.g.:
sysbench --max-requests=0 --max-time=7200 --test=oltp --mysql-user=rcannao --mysql-password=rcannao --mysql-db=rcannao --db-driver=mysql --db-ps-mode=disable --oltp-table-size=10000000 --oltp-point-selects=1 --oltp-index-updates=0 --oltp-simple-ranges=0 --oltp-sum-ranges=0 --oltp-order-ranges=0 --oltp-distinct-ranges=0 --num-threads=30 run

Run a write-intensive workload against the master with a lot of threads. E.g.:
sysbench --max-requests=0 --max-time=7200 --test=oltp --mysql-user=rcannao --mysql-password=rcannao --mysql-db=rcannao --db-driver=mysql --db-ps-mode=disable --oltp-table-size=10000000 --oltp-point-selects=1 --oltp-index-updates=0 --oltp-simple-ranges=0 --oltp-sum-ranges=0 --oltp-order-ranges=0 --oltp-distinct-ranges=0 --num-threads=3000 run

As a comparison, run the above workloads without semi-sync enabled to
verify that the bottleneck is semi-sync.

Suggested fix:
Spin for a few iterations before suspending the thread with pthread_cond_timedwait().
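
One possible shape of such a fix is sketched below. It is an assumption
about how spinning could be combined with the existing wait, not a
patch against the semi-sync plugin: SpinThenWaitAck, spin_limit and the
single-integer position are hypothetical, and the C++ standard library
is used in place of the pthread calls.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <cstdint>
#include <mutex>
#include <thread>

// Hypothetical sketch: poll the ACK position briefly, then fall back
// to a timed wait on the condition variable.
class SpinThenWaitAck {
public:
    bool wait_for_ack(std::uint64_t my_pos, int spin_limit,
                      std::chrono::milliseconds timeout) {
        assert(spin_limit >= 0);
        // Phase 1: a bounded number of lock-free checks. If the ACK
        // arrives during this window the thread never goes to sleep.
        for (int i = 0; i < spin_limit; ++i) {
            if (acked_pos_.load(std::memory_order_acquire) >= my_pos)
                return true;
            std::this_thread::yield();
        }
        // Phase 2: suspend with a timeout, as before.
        std::unique_lock<std::mutex> guard(lock_);
        return cond_.wait_for(guard, timeout, [&] {
            return acked_pos_.load(std::memory_order_acquire) >= my_pos;
        });
    }

    void report_ack(std::uint64_t pos) {
        {
            // The store happens under the lock so that a thread about to
            // sleep in phase 2 cannot miss the update.
            std::lock_guard<std::mutex> guard(lock_);
            if (pos > acked_pos_.load(std::memory_order_relaxed))
                acked_pos_.store(pos, std::memory_order_release);
        }
        cond_.notify_all();
    }

private:
    std::atomic<std::uint64_t> acked_pos_{0};
    std::mutex lock_;
    std::condition_variable cond_;
};
```

Whether spinning helps depends on how long an ACK typically takes
relative to the cost of a context switch; with thousands of runnable
threads the spin budget would need to be tuned and measured rather than
assumed.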

** Affects: mysql-server
     Importance: Unknown
         Status: Unknown

** Affects: percona-server
     Importance: Undecided
         Status: New


** Tags: i50047

** Bug watch added: MySQL Bug System #75570
   http://bugs.mysql.com/bug.php?id=75570

** Also affects: mysql-server via
   http://bugs.mysql.com/bug.php?id=75570
   Importance: Unknown
       Status: Unknown

-- 
You received this bug notification because you are a member of Ubuntu
Server/Client Support Team, which is subscribed to MySQL.
Matching subscriptions: Ubuntu Server/Client Support Team
https://bugs.launchpad.net/bugs/1413836

Title:
  Semi-sync replication performance degrades with high number of threads

To manage notifications about this bug go to:
https://bugs.launchpad.net/mysql-server/+bug/1413836/+subscriptions

