maria-developers team mailing list archive

Thread
Date
Perf tuning for hot key updating with high concurrency

To: maria-developers@xxxxxxxxxxxxxxxxxxx
From: Hui Liu <hickey.liu@xxxxxxxxx>
Date: Sun, 22 Sep 2013 15:56:15 +0800
It's an old issue which discussed in Percona-google-group:
https://groups.google.com/forum/?fromgroups#!topic/percona-discussion/z3r4-Qm0oYg

The patch provided in that discuss group is a rough idea, based on
Percona-Server, and later I almost re-wrote the patch for real-case testing.
I think it's more simple to implement in the InnoDB layer, compared to the
the idea post by Monty, that introduce 'Incremental lock' or 'Relative
update lock' to solve this problem.

I backported them to Mariadb-10.0.4, and use simple simulate test with
oltp_cc.lua, which is provided as an attachment.

These notes migth be helpful to understand of this patch.

1) Why InnoDB's concurrency control is BAD for high concurrency hot key
updating case?
It's described in the percona-disscuss-group, please refer that.

2) What's the basic logic for high concurrency hot key updating?
The idea is still the same as before, but the implement is different now.
With a simple case that 1000 threads updating the same record id=1 and
threshold=2 to describe the basic idea for this patch.

conn (1000) ... (6) -> (5) -> (4)
(1) (running)
(2) (suspended)
(3) (suspended)

That is, thread 1 is running, and thread 2 and 3 are suspended, waiting for
thread 1 relase the record lock for id=1.


For thread 4, it detected they are already two threads suspended, and it
would be rescheduled in hash queue.
conn  (1000) .. (6) -> (5)
(1) (running)
(2) (suspended)
(3) (suspended)
bucket1: (4)
bucket2:

And for 5 and 6 ... 1000, they are also rescheduled in that queue
(1) (running)
(2) (suspended)
(3) (suspended)
bucket1: (1000)...(6) -> (5) -> (4)
bucket2:

After 1 is committed, 2 is waked up and running, then 4 detects it would
entry now, so it enters.
(1) (commit/exit)
(2) (running)
(3) (suspended)
(4) (suspended)
bucket1: (1000) ... (6) -> (5)
bucket2:

The same logic for 5 and 6...1000.

While for the old logic, a snapshot might be:
(1) (commit/exit)
(2) (running)
(3) (suspended)
(4) (suspended)
(5) (suspended)
....
(1000) (suspended)


Hope the demo is not too naive.

You can see, in such case, there are always threshold threads running
inside innodb, while not all are suspended inside.

3) Why not use the new concurrency control with AOTMIC add/sub for active
conc?
The result is bad for new concurrency control in 5.6/10.0, that no FIFO any
more. So I changed to default (5.1/5.5) control strategy again.
Note the changes in CMakeLists.txt that make -DUSE_DEFAULT_CONC=ON as
default.

4) How about the concurrency counter for rescheduled thread in bucket queue?
If thread number waiting for the record lock is over the configured
threshold, it would be rescheduled into queue in one hash bucket, hashed by
(space,space_no, heap_no)
a) for the first one in the queue, that is the queue header, it would
timed-wait retrying whether it would enter into innodb layer, by checking
the n_ref_count for that record stored by HASH.
b) for the non-head threads, they would be FAKE released the concurrent
slot, that means threads in FIFO would be waked up, enter/exit innodb with
faked inside and tickets. Until they are force to exit innodb, the active
slot number is equal to real case without rescheduling.


5) What about the basic test result?
$ ./sysbench/sysbench --test=sysbench/tests/db/oltp_cc.lua
--num-threads=1000 --mysql-socket=/u01/md1004/run/mysql.sock --mysql-db=bmw
--oltp-tables-count=4 --mysql-user=root --oltp-table-size=4
--max-requests=0 --max-time=120000 --oltp-read-only=off run
1000conn, oltp-cc.lua, 4x4(16 hotkey):
patched (innodb_lock_wait_threshold=12 as default)  : 3.1K QPS, 9.1K TPS
unpatched (innodb_lock_wait_threshold=0 as disabled) : 1.2K QPS, 3.5K TPS

$ ./sysbench/sysbench --test=sysbench/tests/db/oltp_cc.lua
--num-threads=1000 --mysql-socket=/u01/md1004/run/mysql.sock --mysql-db=bmw
--oltp-tables-count=1 --mysql-user=root --oltp-table-size=4
--max-requests=0 --max-time=120000 --oltp-read-only=off run
1000conn, oltp-cc.lua, 1x4(4 hotkey):
patched (innodb_lock_wait_threshold=12 as default)  : 1.3K QPS, 3.5K TPS
unpatched (innodb_lock_wait_threshold=0 as disabled) : 200 QPS, 720 TPS


6) Any rest issues?
The rescheduling of the waiting thread breaks the normal mechanism: each
lock request must be registered in lock-table, then deadlock could be found
if exists.
As I am trying to reduce n in lock relaetd O(n) and O(n^2) operations, for
the deadlock case, it does not work very well.

For example, the update sequance:
 t1 updates 1, and t2 updates 1, then t3/t4/t5/t6 update 1, and then t1
updates 2, after that t2 updates 1:

t1 1 2
t2 2 1
t3 1
t4 1
t5 1
t6 1

As we can see, in the new schedule model, t3/t4/t5 are suspended, and might
be rescheduled in queue.
Then t1 is suspended as it updates 2 which record lock is hold by t2. t2 is
rescheduled as it detected too many lock waiting for rec 1.
The snapshot might be:
(t1[2]) (suspended)
(t3) (suspended)
(t4) (suspended)
bucket1:  (t2[1]) -> (t6) -> (t5)
bucket2:
The deadlock can be break until t2 enters with 1. Then for t5/t6, they
would wait until timeout, and then force enter into innodb side, suspended
as it should be. It's same for t2. Until t2 enters, the deadlock is found.
So, the cost for deadlock detecting might be much large than it should be,
depends on the timeout and retry counts.


Hope Jan or Monty would have a look for this patch, or other ideas for
solve this problem, if you are going.
Attachment: oltp_cc.lua
Description: Binary data
Attachment: hot_key_model.diff
Description: Binary data