
maria-discuss team mailing list archive

Re: Deadlocks in R/W phase of Sysbench

 


Salut Stephane,

I/O does not enter into the equation; the database is in memory (RAM).


Regards,


JC


Hello,

Which I/O scheduler are you using, and which filesystem? deadline or noop is a must. Also, mounting with noatime can help for writes.

Would you try these settings? I have found them to make a difference:

autocommit = 1
innodb_adaptive_hash_index = 0
innodb_max_dirty_pages_pct = 20
innodb_write_io_threads = 64
innodb_log_file_size = 1024M
innodb_log_files_in_group = 4
innodb_thread_concurrency = 0
innodb_purge_threads = 8
innodb_change_buffering = none
innodb_open_files = 16384
innodb_file_per_table = 1
innodb_autoinc_lock_mode = 2

And comment out:
# innodb_lru_scan_depth = 4096

Stéphane Varoqui, Senior Consultant
Phone: +33 695-926-401, skype: svaroqui
http://www.mariadb.com

On 9 June 2017 at 16:22, J. Cassidy <sean@xxxxxxxxxxxx> wrote:

Mark,


still scratching my head...


Regards,


JC


Thanks. I was hoping there was an easy fix, for instance if you had used serializable isolation, but you haven't done that.

On Fri, Jun 9, 2017 at 12:27 AM, J. Cassidy <sean@xxxxxxxxxxxx> wrote:
All,
As discussed yesterday, here is my setup and configuration.


*************************************************************
* 16 VCPU, 64 GB memory, DB size 12 GB (in RAM), MariaDB 10.1.24
* Single image. No clustering or outside network interaction
* SuSE SLES 12 SP2 using a 4.4.49-92 kernel
*************************************************************
### Sysbench R/W - 64 Threads

sysbench 0.5: multi-threaded system evaluation benchmark

Running the test with following options:
Number of threads: 64
Report intermediate results every 2 second(s)
Initializing random number generator from seed (42).


Threads
started!

[ 2s] threads: 64, tps: 51141.62, reads: 717023.15, writes: 204757.48, response time: 4.72ms (99%), errors: 54.00, reconnects: 0.00
[ 4s] threads: 64, tps: 47938.19, reads: 671195.72, writes: 191759.28, response time: 4.65ms (99%), errors: 5.00, reconnects: 0.00
[ 6s] threads: 64, tps: 46908.41, reads: 656790.25, writes: 187640.64, response time: 4.65ms (99%), errors: 2.50, reconnects: 0.00
[ 8s] threads: 64, tps: 45940.12, reads: 643191.64, writes: 183781.47, response time: 4.44ms (99%), errors: 3.00, reconnects: 0.00
[ 10s] threads: 64, tps: 45789.91, reads: 641083.20, writes: 183161.13, response time: 4.77ms (99%), errors: 2.00, reconnects: 0.00
[ 12s] threads: 64, tps: 45408.37, reads: 635825.16, writes: 181672.97, response time: 4.80ms (99%), errors: 1.50, reconnects: 0.00
[ 14s] threads: 64, tps: 44319.93, reads: 620440.07, writes: 177262.74, response time: 4.50ms (99%), errors: 0.50, reconnects: 0.00
[ 16s] threads: 64, tps: 44634.78, reads: 624843.97, writes: 178523.13, response time: 4.71ms (99%), errors: 0.50, reconnects: 0.00
.
.
.
[ 572s] threads: 64, tps: 24980.71, reads: 349707.39, writes: 99924.32, response time: 5.92ms (99%), errors: 0.00, reconnects: 0.00
[ 574s] threads: 64, tps: 25337.03, reads: 354796.98, writes: 101355.13, response time: 6.19ms (99%), errors: 0.00, reconnects: 0.00
[ 576s] threads: 64, tps: 25196.31, reads: 352683.33, writes: 100777.24, response time: 6.49ms (99%), errors: 0.00, reconnects: 0.00
[ 578s] threads: 64, tps: 25235.61, reads: 353317.05, writes: 100941.94, response time: 5.94ms (99%), errors: 0.00, reconnects: 0.00
[ 580s] threads: 64, tps: 25241.11, reads: 353395.59, writes: 100956.96, response time: 6.09ms (99%), errors: 0.00, reconnects: 0.00
[ 582s] threads: 64, tps: 25146.18, reads: 352056.04, writes: 100599.73, response time: 5.96ms (99%), errors: 0.00, reconnects: 0.00
[ 584s] threads: 64, tps: 25242.59, reads: 353362.72, writes: 100978.85, response time: 6.21ms (99%), errors: 0.00, reconnects: 0.00
[ 586s] threads: 64, tps: 25297.02, reads: 354187.80, writes: 101166.58, response time: 5.89ms (99%), errors: 0.00, reconnects: 0.00
[ 588s] threads: 64, tps: 25070.60, reads: 350944.86, writes: 100283.39, response time: 6.33ms (99%), errors: 0.00, reconnects: 0.00
[ 590s] threads: 64, tps: 25115.20, reads: 351666.84, writes: 100466.81, response time: 6.21ms (99%), errors: 0.00, reconnects: 0.00
[ 592s] threads: 64, tps: 25165.11, reads: 352314.52, writes: 100668.94, response time: 5.98ms (99%), errors: 0.00, reconnects: 0.00
[ 594s] threads: 64, tps: 25165.99, reads: 352322.89, writes: 100703.97, response time: 6.30ms (99%), errors: 0.00, reconnects: 0.00
[ 596s] threads: 64, tps: 24854.73, reads: 347945.66, writes: 99370.90, response time: 6.17ms (99%), errors: 0.00, reconnects: 0.00
[ 598s] threads: 64, tps: 24888.16, reads: 348463.80, writes: 99556.16, response time: 6.21ms (99%), errors: 0.00, reconnects: 0.00
[ 600s] threads: 64, tps: 25158.22, reads: 352195.58, writes: 100633.38, response time: 5.94ms (99%), errors: 0.00, reconnects: 0.00
OLTP test statistics:
    queries performed:
        read:                            256244548
        write:                           73212536
        other:                           36606172
        total:                           366063256
    transactions:                        18302990 (30454.11 per sec.)
    read/write requests:                 329457084 (548179.44 per sec.)
    other operations:                    36606172 (60908.54 per sec.)
    ignored errors:                      192 (0.32 per sec.)
    reconnects:                          0 (0.00 per sec.)

General statistics:
    total time:                          601.0023s
    total number of events:              18302990
    total time taken by event execution: 38450.5507s
    response time:
        min:                             0.42ms
        avg:                             2.10ms
        max:                             62.26ms
        approx. 99 percentile:           5.62ms

Threads fairness:
    events (avg/stddev):                 285984.2188/3788.79
    execution time (avg/stddev):         600.7899/0.00


=================================================================================================================

### Excerpt from MariaDB error log (thousands of similar messages).
.
.
.
2017-06-08 8:10:44 4395980287248 [ERROR] mysqld: Deadlock found when trying to get lock; try restarting transaction
2017-06-08 8:10:44 4353081699600 [ERROR] mysqld: Deadlock found when trying to get lock; try restarting transaction
2017-06-08 8:10:44 4353087458576 [ERROR] mysqld: Deadlock found when trying to get lock; try restarting transaction
2017-06-08 8:10:44 4360227998992 [ERROR] mysqld: Deadlock found when trying to get lock; try restarting transaction
2017-06-08 8:10:44 4353073515792 [ERROR] mysqld: Deadlock found when trying to get lock; try restarting transaction
2017-06-08 8:10:44 4390705949968 [ERROR] mysqld: Deadlock found when trying to get lock; try restarting transaction
2017-06-08 8:10:44 4360227392784 [ERROR] mysqld: Deadlock found when trying to get lock; try restarting transaction
2017-06-08 8:10:44 4353087458576 [ERROR] mysqld: Deadlock found when trying to get lock; try restarting transaction
2017-06-08 8:10:44 4353073515792 [ERROR] mysqld: Deadlock found when trying to get lock; try restarting transaction
2017-06-08 8:10:44 4353087458576 [ERROR] mysqld: Deadlock found when trying to get lock; try restarting transaction
.
.
.
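As the error text itself says, InnoDB resolves a deadlock by rolling back one victim transaction, and the client is expected to re-run it. A minimal retry-wrapper sketch (my own illustration, not part of the original mail; `DeadlockError` is a stand-in for a real connector's OperationalError with errno 1213):

```python
# Minimal sketch of client-side deadlock handling (illustrative only).
# InnoDB reports deadlocks as MySQL error 1213; the victim transaction
# is rolled back, and the client should simply retry it.
import random
import time


class DeadlockError(Exception):
    """Stand-in for the connector's OperationalError with errno 1213."""
    errno = 1213


def run_with_retry(txn, max_retries=5, base_delay=0.01):
    """Execute `txn` (a callable performing one transaction), retrying on deadlock.

    Backs off exponentially with jitter so colliding threads de-synchronize.
    """
    for attempt in range(max_retries):
        try:
            return txn()
        except DeadlockError:
            if attempt == max_retries - 1:
                raise  # give up after max_retries attempts
            time.sleep(base_delay * (2 ** attempt) * random.random())


# Usage: a transaction that deadlocks twice, then succeeds on the third try.
attempts = {"n": 0}

def flaky_txn():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise DeadlockError("Deadlock found when trying to get lock")
    return "committed"

print(run_with_retry(flaky_txn))  # → committed
```

Sysbench itself counts these retries as "ignored errors", which is why the runs above still complete with respectable TPS.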

===================================================================================================================
The configuration below has been tried and tested, and works across different architectures, e.g. S390X and x86-64.


[mysqld_safe]
user=root
password=''
[mysqld]
symbolic-links=0
open_files_limit = 65535 # was 102400
max_allowed_packet = 16M # was 768M
max-connect-errors = 1000000
connect_timeout = 120 # was 60
performance-schema = false
net_read_timeout = 720
log-warnings=2
##### files and sockets
# innodb_log_group_home_dir = /data-lun22/logs
pid-file=/tmp/mysqld.pid.sysbench
port = 3306
# log-error=/var/log/mariadb/mariadb-error-john.log
log-error=/var/log/mariadb/mariadb-error-john.log
general_log_file=/var/log/mariadb/mariadb-john.log
long_query_time = 0
slow_query_log = 0
# slow_query_log_file = /var/log/mariadb/mariadb-slowquery.log
# general-log
# general_log_file=/var/log/mariadb/mariadb-general.log
##### MariaDB temporary tables
# tmpdir in RAM
tmpdir = /mnt/tmpdir
##### non-innodb options (fixed)
max_connections = 4000
back_log = 150
table_open_cache = 8000 # was 2048 # was 4000 # was 8000
key_buffer_size = 16M
query_cache_type = 0 # was 1 # was 0
join_buffer_size = 32K
sort_buffer_size = 32K
autocommit = 0
thread-cache-size = 50
table-definition-cache = 1024
table-open-cache = 2048
#### fixed innodb options
innodb_file_per_table = true
innodb_open_files = 4096 # was 8192 # was 4096
innodb_data_file_path = ibdata1:250M:autoextend
innodb_flush_log_at_trx_commit = 2
innodb_flush_method = O_DIRECT_NO_FSYNC
innodb_log_buffer_size = 256M
innodb_log_file_size = 4G # was 14G # was 4G
innodb_log_files_in_group = 2
innodb_buffer_pool_size = 36G # was 8G # was 48G
innodb_buffer_pool_instances = 8 # 16 # was 6 # was 32
innodb_adaptive_hash_index_partitions = 32
innodb_thread_concurrency = 0
#### tuning for RAM disk
innodb_adaptive_flushing = 1
innodb_flush_neighbors = 0
innodb_io_capacity = 8000 # was 20000
innodb_io_capacity_max = 40000
innodb_lru_scan_depth = 4096
innodb_purge_threads = 2
innodb_read_io_threads = 2
innodb_write_io_threads = 2
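A quick sanity check on the sizing above (my own arithmetic, not from the original mail): the 36G buffer pool more than covers the ~12 GB data set, and total redo capacity works out to 8G:

```python
# Back-of-envelope check of the my.cnf sizing above (illustrative only).
GiB = 1024 ** 3

buffer_pool = 36 * GiB       # innodb_buffer_pool_size = 36G
data_set = 12 * GiB          # DB size reported in the thread
redo_capacity = 4 * GiB * 2  # innodb_log_file_size x innodb_log_files_in_group

# The whole working set fits in the buffer pool, so reads never hit disk,
# consistent with the note below that NMON shows no paging.
assert buffer_pool >= data_set
print(redo_capacity // GiB)  # → 8
```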


Note: NMON reports 35% memory usage in the middle of the above-mentioned benchmark, ergo the DB is in cache. No paging.

++ Note: The R/W TPS rates are "respectable" based on observations from countless other runs on other configurations/architectures. The TPS rates would obviously improve if the deadlock situation outlined above were resolved.


So there you have it. Any tips / pointers that would enlighten me are much appreciated.

Regards,


JC


Privet Sergey,

I do not think I have a problem with my config(s).

I am not on the machine at the moment. From memory, with 16 cores / 32 VCPU, 64 GB memory, and buffer pools, logs etc. tailored to this configuration, the TPS rates are:

64 threads - R/O 73K, R/W 26K
128 threads - R/O 71K, R/W 23K.

I am putting the DB in RAM of course to obviate I/O issues. I am interested in seeing what is happening with the CPU caches and nest.

NMON tells me that for the R/O runs I have approximately 85% User, 12-14% System and the rest Idle - this stays consistent throughout the run (300 seconds R/O, 600 seconds R/W), which is reasonable.

On the R/W runs, NMON's CPU usage display shows that some CPUs are barely firing, or only intermittently, and the granularity I had with R/O is gone. This would seem to tie in with what I see in the MariaDB log - headwinds, violent storms and deadlocks.

Any further information you require, please let me know.



Regards,


JC

Hi!

I just tried this on x86: 40 threads against a single table having 10k rows.
No deadlocks: "tps: 12372.40, reads: 173219.52, writes: 49490.51".

This is with the latest snapshot of 10.1 and sysbench 1.0.

I believe you shouldn't see deadlocks on such a big data set. Probably something is wrong with your configs?

Regards,
Sergey
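One way to put numbers on the table-size theory raised in this thread: if k concurrent transactions each update one uniformly random row of an N-row table, the birthday-paradox estimate for a collision is 1 - exp(-k(k-1)/2N). A rough sketch (my own illustration; it ignores that sysbench spreads threads over 128 tables and touches several rows per transaction, so treat it only as an order-of-magnitude guide):

```python
import math


def collision_prob(threads, rows):
    """Birthday-paradox estimate: probability that at least two of
    `threads` concurrent transactions pick the same row out of `rows`."""
    return 1 - math.exp(-threads * (threads - 1) / (2 * rows))


# 128 threads on an 85937-row table vs Mark's suggested 1M+ rows:
print(round(collision_prob(128, 85937), 3))      # roughly 9%
print(round(collision_prob(128, 1_000_000), 4))  # under 1%
```

Shrinking the table by a factor of ~12 raises the per-transaction collision odds by roughly the same factor, which is consistent with deadlocks appearing on small tables but not large ones.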

On Thu, Jun 08, 2017 at 04:51:31PM +0200, J. Cassidy wrote:
>
>
> Hello Mark,
>
> Appreciate the reply.
>
> The OLTP table size and OLTP size give me a DB size of approx. 12GB. This is what I want. I have in the meantime looked at some older logs and see that whatever number of threads I specify (8, 16, 32, 64 or 128), the deadlock messages are still surfacing in the R/W phase of Sysbench. I even dropped in two different MariaDB release levels (10.0.19, 10.1.24) to see whether it would make a difference; the deadlock is still there. I am using Sysbench 0.5 for these tests.
> I am currently using MariaDB 10.1.24 - built directly on the machine. S390x by the way, but the problem is also occurring on an x86-64 box.
>
> Regards,
>
> JC
>
> I run sysbench frequently with MySQL (not MariaDB yet) and don't recall ever seeing this error. But I use much larger values for --oltp-table-size. You used 85937 and I use 1M or larger. Maybe there is too much data contention with smaller tables, even if a large number of tables is used (128 in your case). I use a smaller number of tables - between 4 and 20.
>
> On Thu, Jun 8, 2017 at 12:48 AM, J. Cassidy <sean@xxxxxxxxxxxx> wrote:
> Hello all,
>
> I have used Sysbench extensively to test MariaDB CPU load factors between various architectures.
>
> I am noticing massive amounts of "[ERROR] mysqld: Deadlock found when trying to get lock; try restarting transaction" in the MariaDB error log during the R/W phase. The Sysbench R/W TPS rates are still respectable for my purposes, but how do I correct this condition? Can I ignore it / must I live with it?
> I have tried three different MariaDB releases (10.0.19, 10.1.12, 10.1.24) to narrow things down, but this message (x 1000's) is happening across all releases in the Sysbench R/W phase.
>
> Here is the Sysbench command that drives the workload against MariaDB -
>
> sysbench --test=lua/oltp.lua --oltp_tables_count=128 --oltp-table-size=85937 --rand-seed=42 --rand-type=uniform --num-threads=128 --oltp-read-only=off --report-interval=2 --mysql-socket=/var/lib/mysql/mysql.sock --max-time=201 --max-requests=0 --mysql-user=root --percentile=99 run
>
> Would anyone have a tip / idea / pointer?
>
>
> Regards,
>
>
> JC
>
>
> John Cassidy
>
> Obere Bühlstrasse 21
> 8700 Küsnacht (ZH)
> Switzerland / Suisse / Schweiz
>
> Mobile: +49 152 58961601 (Germany)
> Mobile: +352 621 577 149 (Luxembourg)
> Mobile: +41 78 769 17 97 (CH)
> Landline: +41 44 509 1957
>
> http://www.jdcassidy.eu
>
> "Aut viam inveniam aut faciam" - Hannibal.
> _______________________________________________
> Mailing list: https://launchpad.net/~maria-discuss
> Post to: maria-discuss@xxxxxxxxxxxxxxxxxxx
> Unsubscribe: https://launchpad.net/~maria-discuss
> More help: https://help.launchpad.net/ListHelp
>
> -- Mark Callaghan
> mdcallag@xxxxxxxxx





John Cassidy

Obere Bühlstrasse 21
8700 Küsnacht (ZH)
Switzerland / Suisse / Schweiz

Mobile: +49 152 58961601 (Germany)
Mobile: +352 621 577 149 (Luxembourg)
Mobile: +41 78 769 17 97 (CH)
Landline: +41 44 509 1957

http://www.jdcassidy.eu

"Aut viam inveniam aut faciam" - Hannibal.

_______________________________________________
Mailing list: https://launchpad.net/~maria-discuss
Post to: maria-discuss@xxxxxxxxxxxxxxxxxxx
Unsubscribe: https://launchpad.net/~maria-discuss
More help: https://help.launchpad.net/ListHelp

-- Mark Callaghan
mdcallag@xxxxxxxxx

