← Back to team overview

maria-discuss team mailing list archive

Re: Deadlocks in R/W phase of Sysbench

 


Stephane,

all of my directories, logs, spaces, tables etc. are in RAM.
During the course of the benchmark there in ZERO I/O
which is my
intention.

Here is a snippet from the MariaDB KB
-autocommit
By default, MariaDB runs with autocommit
mode enabled. This means that as soon as you execute a statement that updates
(modifies) a table, MariaDB stores the update on disk to make it permanent. To
disable autocommit mode, use the following statement:SET autocommit=0;
So it would seem that I am running with autocommit enabled.

Regards,


JC

Re, Yep get this but just that io a supposed to
be fast but still have to hit FS moved from PB to redo log to tblspaceIn a
perfect fast world all layer should have equal ressources so no reasons not giving
as many threads to write io vs doing the work itself of modifying
pages. Disabling auto commit with sysbench is
suspicious and intersting : never question myself on this topic before
? /svarStéphane Varoqui, Senior
ConsultantPhone: +33 695-926-401, skype:
svaroquihttp://www.mariadb.com ;
  Le 9 juin 2017 à 16:47, J. Cassidy
sean@xxxxxxxxxxxx> a écrit : Salut Stephane,

I/O does not enter into the equation, the database is in memory (RAM).


Regards,


JC

 Hello, What IO scheduler are you using for
FS and witch one ?deadline or loop is a must do. Also mounting with noatime
can help for write Would you try such settings i found
to make a differenceautocommit =
1  innodb_adaptive_hash_index=0innodb_max_dirty_page_pct=20innodb_write_io_threads
=64innodb_log_file_size =
1024Minnodb_log_file_in_group =
4innodb_log_files_in_group = 4
innodb_thread_concurrency = 0innodb_purge_threads = 8 innodb_change_buffering=noneinnodb_open_files = 16384innodb_file_per_table=1innodb_autoinc_lock_mode =
2 And comment#
innodb_lru_scan_depth =
4096  Stéphane Varoqui, Senior
ConsultantPhone: +33 695-926-401, skype:
svaroquihttp://www.mariadb.com Le
9 juin 2017 à 16:22, J. Cassidy sean@xxxxxxxxxxxx> a écrit :Mark,


still scratching head...


Regards,


JC

 Thanks. I was hoping there was an easy fix if you did
something like used serializable isolation, but you haven't done
that.On Fri, Jun 9, 2017 at 12:27 AM, J.
Cassidy sean@xxxxxxxxxxxx> wrote:
All,
as discussed
yesterday, here is my setup and configuration.


*************************************************************
16 VCPU 64GB
Memory, DB size 12 GB (In RAM), MariaDB 10.1.24*
Single Image. No clustering
or outside network interaction *
SuSE SLES 12 SP2 using a 4.4.49-92 kernel
*
*************************************************************
###
Sysbench R/W - 64 Threads

sysbench 0.5: multi-threaded system
evaluation benchmark

Running the test with following options:
Number of threads: 64
Report intermediate results every 2 second(s)
Initializing random number generator from seed (42).


Threads
started!

[ 2s] threads: 64, tps: 51141.62, reads: 717023.15, writes:
204757.48, response time: 4.72ms (99%), errors: 54.00, reconnects: 0.00
[ 4s]
threads: 64, tps: 47938.19, reads: 671195.72, writes: 191759.28, response time:
4.65ms (99%), errors: 5.00, reconnects: 0.00
[ 6s] threads: 64, tps:
46908.41, reads: 656790.25, writes: 187640.64, response time: 4.65ms (99%),
errors: 2.50, reconnects: 0.00
[ 8s] threads: 64, tps: 45940.12, reads:
643191.64, writes: 183781.47, response time: 4.44ms (99%), errors: 3.00,
reconnects: 0.00
[ 10s] threads: 64, tps: 45789.91, reads: 641083.20, writes:
183161.13, response time: 4.77ms (99%), errors: 2.00, reconnects: 0.00
[ 12s]
threads: 64, tps: 45408.37, reads: 635825.16, writes: 181672.97, response time:
4.80ms (99%), errors: 1.50, reconnects: 0.00
[ 14s] threads: 64, tps:
44319.93, reads: 620440.07, writes: 177262.74, response time: 4.50ms (99%),
errors: 0.50, reconnects: 0.00
[ 16s] threads: 64, tps: 44634.78, reads:
624843.97, writes: 178523.13, response time: 4.71ms (99%), errors: 0.50,
reconnects: 0.00
.
.
.
[ 572s] threads: 64, tps: 24980.71,
reads: 349707.39, writes: 99924.32, response time: 5.92ms (99%), errors: 0.00,
reconnects: 0.00
[ 574s] threads: 64, tps: 25337.03, reads: 354796.98,
writes: 101355.13, response time: 6.19ms (99%), errors: 0.00, reconnects: 0.00
[ 576s] threads: 64, tps: 25196.31, reads: 352683.33, writes: 100777.24,
response time: 6.49ms (99%), errors: 0.00, reconnects: 0.00
[ 578s] threads:
64, tps: 25235.61, reads: 353317.05, writes: 100941.94, response time: 5.94ms
(99%), errors: 0.00, reconnects: 0.00
[ 580s] threads: 64, tps: 25241.11,
reads: 353395.59, writes: 100956.96, response time: 6.09ms (99%), errors: 0.00,
reconnects: 0.00
[ 582s] threads: 64, tps: 25146.18, reads: 352056.04,
writes: 100599.73, response time: 5.96ms (99%), errors: 0.00, reconnects: 0.00
[ 584s] threads: 64, tps: 25242.59, reads: 353362.72, writes: 100978.85,
response time: 6.21ms (99%), errors: 0.00, reconnects: 0.00
[ 586s] threads:
64, tps: 25297.02, reads: 354187.80, writes: 101166.58, response time: 5.89ms
(99%), errors: 0.00, reconnects: 0.00
[ 588s] threads: 64, tps: 25070.60,
reads: 350944.86, writes: 100283.39, response time: 6.33ms (99%), errors: 0.00,
reconnects: 0.00
[ 590s] threads: 64, tps: 25115.20, reads: 351666.84,
writes: 100466.81, response time: 6.21ms (99%), errors: 0.00, reconnects: 0.00
[ 592s] threads: 64, tps: 25165.11, reads: 352314.52, writes: 100668.94,
response time: 5.98ms (99%), errors: 0.00, reconnects: 0.00
[ 594s] threads:
64, tps: 25165.99, reads: 352322.89, writes: 100703.97, response time: 6.30ms
(99%), errors: 0.00, reconnects: 0.00
[ 596s] threads: 64, tps: 24854.73,
reads: 347945.66, writes: 99370.90, response time: 6.17ms (99%), errors: 0.00,
reconnects: 0.00
[ 598s] threads: 64, tps: 24888.16, reads: 348463.80,
writes: 99556.16, response time: 6.21ms (99%), errors: 0.00, reconnects: 0.00
[ 600s] threads: 64, tps: 25158.22, reads: 352195.58, writes: 100633.38,
response time: 5.94ms (99%), errors: 0.00, reconnects: 0.00
OLTP test
statistics:
queries performed:
read: 256244548
write: 73212536
other: 36606172
total: 366063256
transactions: 18302990 (30454.11 per
sec.)
read/write requests: 329457084 (548179.44 per sec.)
other
operations: 36606172 (60908.54 per sec.)
ignored errors: 192 (0.32 per
sec.)
reconnects: 0 (0.00 per sec.)

General statistics:
total
time: 601.0023s
total number of events: 18302990
total time taken by
event execution: 38450.5507s
response time:
min: 0.42ms
avg:
2.10ms
max: 62.26ms
approx. 99 percentile: 5.62ms

Threads
fairness:
events (avg/stddev): 285984.2188/3788.79
execution time
(avg/stddev): 600.7899/0.00


=================================================================================================================

### Excerpt from MariaDB Error Log (Thousands of similiar messages).
.
.
.
2017-06-08 8:10:44 4395980287248 [ERROR] mysqld: Deadlock
found when trying to get lock; try restarting transaction
2017-06-08 8:10:44
4353081699600 [ERROR] mysqld: Deadlock found when trying to get lock; try
restarting transaction
2017-06-08 8:10:44 4353087458576 [ERROR] mysqld:
Deadlock found when trying to get lock; try restarting transaction
2017-06-08
8:10:44 4360227998992 [ERROR] mysqld: Deadlock found when trying to get lock; try
restarting transaction
2017-06-08 8:10:44 4353073515792 [ERROR] mysqld:
Deadlock found when trying to get lock; try restarting transaction
2017-06-08
8:10:44 4390705949968 [ERROR] mysqld: Deadlock found when trying to get lock; try
restarting transaction
2017-06-08 8:10:44 4360227392784 [ERROR] mysqld:
Deadlock found when trying to get lock; try restarting transaction
2017-06-08
8:10:44 4353087458576 [ERROR] mysqld: Deadlock found when trying to get lock; try
restarting transaction
2017-06-08 8:10:44 4353073515792 [ERROR] mysqld:
Deadlock found when trying to get lock; try restarting transaction
2017-06-08
8:10:44 4353087458576 [ERROR] mysqld: Deadlock found when trying to get lock; try
restarting transaction
.
.
.

===================================================================================================================
The configuration below has been tried and tested and works across on different
architectures e.g. S390X and S390X.


[mysqld_safe]
user=root
password=''
[mysqld]
symbolic-links=0
open_files_limit = 65535 # was 102400
max_allowed_packet = 16M # was
768M
max-connect-errors = 1000000
connect_timeout = 120 # was 60
performance-schema = false
net_read_timeout = 720
log-warnings=2
#####files and sockets
# innodb_log_group_home_dir= /data-lun22/logs
pid-file=/tmp/mysqld.pid.sysbench
port = 3306
#
log-error=/var/log/mariadb/mariadb-error-john.log
log-error=/var/log/mariadb/mariadb-error-john.log
general_log_file=/var/log/mariadb/mariadb-john.log
long_query_time = 0
slow_query_log = 0
# slow_query_log_file =
/var/log/mariadb/mariadb-slowquery.log
# general-log
#
general_log_file=/var/log/mariadb/mariadb-general.log
##### MariaDB temporary
tables
# tmpdir in RAM
tmpdir = /mnt/tmpdir
#####non innodb options
(fixed)
max_connections = 4000
back_log = 150
table_open_cache =
8000 # was 2048 # was 4000 # was 8000
key_buffer_size = 16M
query_cache_type = 0 #was 1 # was 0
join_buffer_size = 32K
sort_buffer_size = 32K
autocommit = 0
thread-cache-size = 50
table-definition-cache = 1024
table-open-cache = 2048
####fixed innodb
options
innodb_file_per_table = true
innodb_open_files = 4096 # was 8192
# was 4096
innodb_data_file_path = ibdata1:250M:autoextend
innodb_flush_log_at_trx_commit = 2
innodb_flush_method =
O_DIRECT_NO_FSYNC
innodb_log_buffer_size = 256M
innodb_log_file_size =
4G # was 14G # was 4G
innodb_log_files_in_group = 2
innodb_buffer_pool_size = 36G #was 8G # was 48G
innodb_buffer_pool_instances = 8 # 16 # was 6 # was 32
innodb_adaptive_hash_index_partitions = 32
innodb_thread_concurrency = 0
####tuning for RAM disk
innodb_adaptive_flushing = 1
innodb_flush_neighbors = 0
innodb_io_capacity = 8000 # was 20000
innodb_io_capacity_max = 40000
innodb_lru_scan_depth = 4096
innodb_purge_threads = 2
innodb_read_io_threads = 2
innodb_write_io_threads = 2


Note NMON reports 35% memory usage
in the middle of the abovementioned Banchmark ergo the DB is in cache. No
paging.

++ Note The R/W TPS rate are "respectable" based on
observations from countless other runs ran on other
configurations/architecture.
The TPS rates would obviously improve if the
deadlock situation outlined above was resolved.


So there you have
it. Any tips / pointers that would enlighten me are much appreciated.

Regards,


JC


Privet Sergey,

do not
think I have a problem with my config(s).

I am not on the machine at
the moment.
>From memory, with 16 Cores / 32 VCPU, 64 GB memory, buffer pools,
logs etc. tailored to this configuration, the TPS rate is

64 threads -
R/O 73K R/W 26K
128 threads - R/O 71K R/W 23K.

I am putting the DB
in RAM of course to obviate I/O issues. I am interested in seeing what is
happening with the CPU caches and nest..

NMON tells me that for the R/O
runs I have approximately 85% User, 12-14% System and the rest Idle - this stays
consistent throughout the run (300 seconds R/O, 600 seconds R/W) which is
reasonable.

On the R/W runs, NMONs CPU usage display shows that some
CPUs are barely firing, or only intermittently and the granularity I had with R/O
is gone. This would seem to tie in with what I see in the MariaDB log - headwinds,
violent storms and deadlocks.

Any further information you require,
please let me know.



Regards,


JC

Hi!

I just tried this on x86: 40 threads against single table having
10k rows.
No deadlocks, "tps: 12372.40, reads: 173219.52, writes:
49490.51".

This is with latest snapshot of 10.1 and sysbench
1.0.

I believe you shouldn't see deadlocks on such a big data set.
Probably
something is wrong with your configs?

Regards,
Sergey

On Thu, Jun 08, 2017 at 04:51:31PM +0200, J.
Cassidy wrote:
>
>
> Hello Mark,
>
>
appreciate the reply.
>
> The OLTP table size and
> OLTP
size give me a DB size of appx 12GB. This is what I want. I have in the
>
meantime, looked at some older logs and see that whatever amount of threads I
> specify (8, 16,32, 64 or 128), the deadlock messages are still surfacing in
the
> R/W phase
> of Sysbench. I even dropped in two different
MariaDB release levels
> (10.0.19, 10.1.24) to see whether it would make a
difference, deadlock still
> there. I am using Sysbench 0.5 for these
tests.
> I am currently using MariaDB
> 10.1.24 - built directly
on the machine. S390x by the way, but the problem is
> also
>
occuring on an X86-64 box.
>
> Regards,
>
>
> JC
>
>
>
> I run sysbench frequently with
MySQL
> (not MariaDB yet) and don't recall ever seeing this error.
But I use much
> larger values for --oltp-table-size. You used 85937 and I
use 1M or larger.
> Maybe
> there is too much data contention with
smaller tables, even if a large number of
> tables is used (128 in your
case). I use a smaller number of tables - between 4
> and 20. On Thu, Jun
8, 2017 at 12:48 AM, J. Cassidy sean@xxxxxxxxxxxx>
wrote:
> Hello all,
>
>
> have used Sysbench
extensively to test MariaDB CPU load factors between various
>
architectures.
>
> I am noticing massive amounts of "[ERROR]
mysqld:
> Deadlock found when trying to get lock; try restarting
transaction" in the
> MariaDB error log during the R/W phase. The
Sysbench R/W TPS rates are still
> respectable for my purposes, but how do
I
> correct this condition?? Can I
> ignore this / must live with
it?
> I have tried three different MariaDB
> releases (10.0.19,
10.1.12, 10.1.24) to narrow things down, but this message (x
>
1000's) is happening across all releases in the Sysbench R/W phase.
>
> Here is the Sysbench command that drives the workload against
MariaDB -
>
> sysbench --test=lua/oltp.lua --oltp_tables_count=128
--oltp-table-size=85937
> --rand-seed=42 --rand-type=uniform
--num-threads=128 --oltp-read-only=off
> --report-interval=2
--mysql-socket=/var/lib/mysql/mysql.sock --max-time=201
> --max-requests=0
--mysql-user=root --percentile=99 run
>
> Would anyone
>
have a tip / idea /pointer?
>
>
> Regards,
>
>
> JC
>
>
> John Cassidy
>
>
Obere Bühlstrasse 21
> 8700 Küsnacht
> (ZH)
>
Switzerland / Suisse / Schweiz
>
>
> Mobile: +49 152 58961601 (Germany)
> Mobile: +352 621 577 149
(Luxembourg)
> Mobile: +41 78 769 17 97 (CH)
> Landline: +41 44 509 1957
>
> http://www.jdcassidy.eu
>
> "Aut
viam inveniam aut
> faciam" - Hannibal.
>
_______________________________________________
> Mailing list: https://launchpad.net/~maria-discuss
>
Post to :
> maria-discuss@xxxxxxxxxxxxxxxxxxx
> Unsubscribe : https://launchpad.net/~maria-discuss
>
More help : https://help.launchpad.net/ListHelp
> --Mark Callaghan
> mdcallag@xxxxxxxxx
>
> John
Cassidy
>
> Obere Bühlstrasse 21
> 8700
>
Küsnacht (ZH)
> Switzerland / Suisse / Schweiz
>
>
> Mobile:
> +49 152 58961601 (Germany)
> Mobile: +352 621 577 149 (Luxembourg)
> Mobile:
> +41 78 769 17 97 (CH)
> Landline: +41 44 509 1957
>
> http://www.jdcassidy.eu
>
> "Aut
viam inveniam aut faciam" -
> Hannibal.

>
_______________________________________________
> Mailing list: https://launchpad.net/~maria-discuss
>
Post to : maria-discuss@xxxxxxxxxxxxxxxxxxx
> Unsubscribe : https://launchpad.net/~maria-discuss
>
More help : https://help.launchpad.net/ListHelp




John
Cassidy

Obere Bühlstrasse 21
8700 Küsnacht (ZH)
Switzerland / Suisse / Schweiz


Mobile: +49 152 58961601 (Germany)
Mobile: +352 621 577 149 (Luxembourg)
Mobile: +41 78 769 17 97 (CH)
Landline: +41 44 509 1957

http://www.jdcassidy.eu

"Aut viam inveniam aut
faciam" - Hannibal.

John Cassidy

Obere Bühlstrasse
21
8700 Küsnacht (ZH)
Switzerland / Suisse / Schweiz


Mobile: +49 152 58961601 (Germany)
Mobile: +352 621 577 149 (Luxembourg)
Mobile: +41 78 769 17 97 (CH)
Landline: +41 44 509 1957

http://www.jdcassidy.eu

"Aut viam inveniam aut
faciam" - Hannibal.
_______________________________________________
Mailing list: https://launchpad.net/~maria-discuss
Post to : maria-discuss@xxxxxxxxxxxxxxxxxxx
Unsubscribe : https://launchpad.net/~maria-discuss
More help : https://help.launchpad.net/ListHelp --Mark
Callaghan
mdcallag@xxxxxxxxx

John Cassidy

Obere Bühlstrasse
21
8700
Küsnacht (ZH)
Switzerland / Suisse / Schweiz


Mobile: +49 152 58961601
(Germany)
Mobile:
+352 621 577 149 (Luxembourg)
Mobile: +41 78 769 17 97 (CH)
Landline: +41 44 509 1957

http://www.jdcassidy.eu

"Aut viam inveniam aut
faciam" - Hannibal._______________________________________________
Mailing list: https://launchpad.net/~maria-discuss
Post to : maria-discuss@xxxxxxxxxxxxxxxxxxx
Unsubscribe : https://launchpad.net/~maria-discuss
More help : https://help.launchpad.net/ListHelp

John Cassidy

Obere Bühlstrasse 21
8700 Küsnacht
(ZH)
Switzerland / Suisse / Schweiz


Mobile: +49 152 58961601
(Germany)
Mobile: +352 621 577 149 (Luxembourg)
Mobile: +41 78 769 17 97
(CH)
Landline: +41 44 509 1957

http://www.jdcassidy.eu

"Aut viam inveniam aut
faciam" - Hannibal.

John Cassidy

Obere Bühlstrasse 21
8700 Küsnacht
(ZH)
Switzerland / Suisse / Schweiz


Mobile: +49 152 58961601
(Germany)
Mobile: +352 621 577 149 (Luxembourg)
Mobile: +41 78 769 17 97
(CH)
Landline: +41 44 509 1957

http://www.jdcassidy.eu

"Aut viam inveniam aut faciam" - Hannibal.

Follow ups

References