← Back to team overview

maria-discuss team mailing list archive

Re: MariaDB / Sysbench TPS question

 

Hi,

There are a number of things it could be.  I've provided data collection instruction if these suggestions don't help, but it is most likely a combination of the three following issues.

A). You might be dirtying your buffer pool at a rate at which flushing can not keep up.  The most likely reasons are that you have innodb_io_capacity too low (and perhaps innodb_lru_seek_depth).  You have innodb_io_capacity=4000.  Since you have sysbench, try a 16kb random io write test and see how many IOPS you get.  You may have to adjust this value up or down accordingly.  You should set it to about 60-80% of your actual write IO capacity, because you need IOPS for reads, and this variable only takes into account writes.  You could also try doing a random read/write test (16kb) and set it to 1/3 the IOPS you get, and set innodb_io_cspacity_max=1/2 of it. 

B) you definitely have too many buffer pools.  You should use a buffer pool per 8 gb, so you should use six or seven buffer pools.  This should seriously reduce the flush rate and yield better overall results.

C) for smooth transaction rates with many threads writing to disk, 4GB log size is almost certainly too small.  Try 16GB logs.  

Try C) this first to see if it smooths out your ride.  If it just moves it to later, try the above changes which affect flushing.  If it fixes it, try the above changes anyway, as you'll have a baseline to compare them too, but definitely reduce the number of buffer pools and compare.

On a side note:
Are you using one disk or multiple disks?  Is this a SAN?  If it is a SAN, then using DIRECT IO is probably a bad idea, but shouldn't cause periodic drops.

Also no warranty on advice, if it causes your database to nuclear detonate, sorry, and as always YMMV, tuning is an art, not a science.

To collect diag info if above suggestions fail:
-----------------

Please run sysbench for six minutes.

At 30 second intervals execute:
SHOW ENGINE INNODB STATUS\G
SHOW GLOBAL STATUS;

In addition(only run these once, terminate after sysbench)
iostat -kx 15> disk.txt &
vmstat 15 > mem.txt &

Or get pt-stalk from Percona toolkit, which will collect this info (and a bunch more) automatically.

--Swany

Sent from my iPhone

> On Oct 7, 2015, at 8:14 AM, J. Cassidy <sean@xxxxxxxxxxxx> wrote:
> 
> Hello all,
> 
> 
> I am running MariaDB 10.1.5 with Sysbench 0.5 (lua) against an IBM Blade
> (Haswell) using Hitachi 15K SAS Hdd's.
> 
> Operating system is RHEL 7. Hyperthreading is turned off. System memory is
> 48GB.
> 
> Up to now, I have ran extensive Sysbench tests against MariaDB using RAM
> as the datastore with no appreciable problems.
> 
> Now I have switched the datastore to normal disk (see above) and am seeing
> the Sysbench R/W TPS rate take a hit after approximately 3 minutes (180
> seconds) into the run (runtime is 600 seconds) and every three minutes
> thereafter. I am seeing nothing in the MariaDB logs (error and Slow
> queries), nor in Linux itself, nor is Sysbench telling me anything (I have
> debug rurned on and verbosity set to 5).
> 
> 
> 
> Here is an excerpt from one of the R/W Sysbench runs
> 
> 
> 
> [ 168s] threads: 128, tps: 26262.00, reads: 367679.02, writes: 105188.51,
> response time: 50.12ms (99%), errors: 0.00, reconnects:  0.00
> [ 170s] threads: 128, tps: 25913.49, reads: 362826.86, writes: 103670.96,
> response time: 49.08ms (99%), errors: 0.00, reconnects:  0.00
> [ 172s] threads: 128, tps: 26271.51, reads: 367736.58, writes: 105009.52,
> response time: 50.34ms (99%), errors: 0.00, reconnects:  0.00
> [ 174s] threads: 128, tps: 26059.52, reads: 364822.73, writes: 104187.07,
> response time: 44.69ms (99%), errors: 0.00, reconnects:  0.00
> [ 176s] threads: 128, tps: 26394.00, reads: 369573.99, writes: 105652.50,
> response time: 52.28ms (99%), errors: 0.00, reconnects:  0.00
> [ 178s] threads: 128, tps: 26248.49, reads: 367422.90, writes: 104927.97,
> response time: 49.85ms (99%), errors: 0.00, reconnects:  0.00
> [ 180s] threads: 128, tps: 26549.00, reads: 371726.52, writes: 106166.01,
> response time: 60.01ms (99%), errors: 0.00, reconnects:  0.00
> [ 182s] threads: 128, tps: 21493.92, reads: 301020.88, writes: 86033.18,
> response time: 47.29ms (99%), errors: 0.00, reconnects:  0.00
> [ 184s] threads: 128, tps: 174.00, reads: 2435.96, writes: 706.99,
> response time: 2245.86ms (99%), errors: 0.00, reconnects:  0.00
> [ 186s] threads: 128, tps: 185.50, reads: 2597.00, writes: 760.50,
> response time: 2334.28ms (99%), errors: 0.00, reconnects:  0.00
> [ 188s] threads: 128, tps: 269.00, reads: 3766.00, writes: 1066.50,
> response time: 2314.11ms (99%), errors: 0.00, reconnects:  0.00
> [ 190s] threads: 128, tps: 153.50, reads: 2149.00, writes: 619.50,
> response time: 2121.06ms (99%), errors: 0.00, reconnects:  0.00
> [ 192s] threads: 128, tps: 552.00, reads: 7728.00, writes: 2199.50,
> response time: 1440.33ms (99%), errors: 0.00, reconnects:  0.00
> [ 194s] threads: 128, tps: 615.00, reads: 8610.00, writes: 2450.00,
> response time: 1414.69ms (99%), errors: 0.00, reconnects:  0.00
> [ 196s] threads: 128, tps: 908.00, reads: 12712.00, writes: 3650.00,
> response time: 950.93ms (99%), errors: 0.00, reconnects:  0.00
> [ 198s] threads: 128, tps: 1579.00, reads: 22052.47, writes: 6283.99,
> response time: 932.61ms (99%), errors: 0.00, reconnects:  0.00
> [ 200s] threads: 128, tps: 1562.00, reads: 21921.53, writes: 6257.01,
> response time: 748.43ms (99%), errors: 0.00, reconnects:  0.00
> [ 202s] threads: 128, tps: 4166.00, reads: 58323.95, writes: 16652.49,
> response time: 445.78ms (99%), errors: 0.00, reconnects:  0.00
> [ 204s] threads: 128, tps: 11574.99, reads: 161931.90, writes: 46264.97,
> response time: 263.77ms (99%), errors: 0.00, reconnects:  0.00
> [ 206s] threads: 128, tps: 17420.50, reads: 244004.98, writes: 69714.50,
> response time: 103.69ms (99%), errors: 0.00, reconnects:  0.00
> [ 208s] threads: 128, tps: 19833.49, reads: 277626.90, writes: 79424.97,
> response time: 93.88ms (99%), errors: 0.00, reconnects:  0.00
> [ 210s] threads: 128, tps: 24994.00, reads: 349910.48, writes: 99975.49,
> response time: 48.12ms (99%), errors: 0.00, reconnects:  0.00
> [ 212s] threads: 128, tps: 25129.50, reads: 351825.03, writes: 100549.51,
> response time: 52.12ms (99%), errors: 0.00, reconnects:  0.00
> [ 214s] threads: 128, tps: 24982.99, reads: 349739.36, writes: 99821.46,
> response time: 47.66ms (99%), errors: 0.00, reconnects:  0.00
> 
> My.cnf -
> 
> [mysqld_safe]
> user=root
> password=''
> [mysqld]
> open_files_limit = 102400
> max_allowed_packet = 768M
> connect_timeout = 120   # was 60
> performance-schema = false
> net_read_timeout = 720
> log-warnings=2
> #####files and sockets
> pid-file=/tmp/mysqld.pid.sysbench
> log_warnings=2
> log-error=/var/log/mariadb/mariadb-error-john.log
> long_query_time = 1
> slow_query_log = 1
> slow_query_log_file = /var/log/mariadb/mariadb-slowquery.log
> log_queries_not_using_indexes = 1
> # general-log
> # general-log-file=/var/log/mariadb/mariadb-general.log
> # general_log_file=/var/log/mariadb/mariadb-general.log
> ##### MariaDB temporary tables
> tmpdir = /mnt/ramdisk
> 
> #####non innodb options (fixed)
> max_connections = 8000 # was 4000
> back_log = 150
> table_open_cache = 4000 # was 8000
> key_buffer_size = 16M
> query_cache_type = 1 # was 0
> join_buffer_size = 32K
> sort_buffer_size = 32K
> 
> #####use InnoDB plugin
> #ignore-builtin-innodb
> #plugin-load=innodb=ha_innodb.so
> 
> #####fixed innodb options
> innodb_file_per_table = true
> innodb_open_files = 8192 # was 4096
> innodb_data_file_path = ibdata1:250M:autoextend
> innodb_flush_log_at_trx_commit = 2
> innodb_flush_method = O_DIRECT_NO_FSYNC
> innodb_log_buffer_size = 256M
> innodb_log_file_size = 4G
> innodb_log_files_in_group = 2
> innodb_buffer_pool_size = 40G
> innodb_buffer_pool_instances = 32
> innodb_adaptive_hash_index_partitions = 32
> innodb_thread_concurrency = 0
> 
> #####Power8 specific
> #innodb_spin_wait_delay = 0
> 
> #####tuning for SAN storage
> innodb_adaptive_flushing = 1
> innodb_flush_neighbors = 1
> innodb_io_capacity = 4000
> innodb_io_capacity_max = 6000
> innodb_lru_scan_depth = 4096
> innodb_purge_threads = 2
> innodb_read_io_threads = 8    # was 8
> innodb_write_io_threads = 16
> 
> ####tuning for RAM disk
> #innodb_adaptive_flushing = 1
> #innodb_flush_neighbors = 0
> #innodb_io_capacity = 20000
> #innodb_io_capacity_max = 40000
> #innodb_lru_scan_depth = 4096
> #innodb_purge_threads = 2
> #innodb_read_io_threads = 2
> #innodb_write_io_threads = 2
> 
> Would anyone be able to enlighten me as to why I am getting this TPS
> performance hit after approximately three minutes?
> 
> 
> Any pointers much appreciated.
> 
> 
> Regards,
> 
> 
> Aubrey
> 
> 
> 
> _______________________________________________
> Mailing list: https://launchpad.net/~maria-discuss
> Post to     : maria-discuss@xxxxxxxxxxxxxxxxxxx
> Unsubscribe : https://launchpad.net/~maria-discuss
> More help   : https://help.launchpad.net/ListHelp


References