← Back to team overview

maria-developers team mailing list archive

Re: a doubt about locks on supremum


Hi! I am very sorry for the very much delayed reply, but I
accidentally found this message today while looking for something

On Mon, May 29, 2017 at 11:08 AM, hongxiang jiang <jhx1008@xxxxxxxxx> wrote:
> hi everyone:
> I use sysbench to do some tests on MySQL 5.7.x,   my table like this:
> CREATE TABLE `sbtest1` (
>   `id` int(11) NOT NULL AUTO_INCREMENT,
>   `k` int(11) NOT NULL DEFAULT '0',
>   `c` char(120) NOT NULL DEFAULT '',
>   `pad` char(60) NOT NULL DEFAULT '',
>   PRIMARY KEY (`id`),
> ) ENGINE=InnoDB;
> secondary index is not neccessary,  I do not create it!!
> I have two sessions:
> session1: begin; select * from sbtest1 where id=123456 for update
> session2: insert into sbtest1 values(123450, 0,'aaa','bbb');
> the records in the table like below:
> (...),(123440, 264454, xxxxxx, xxxxx), (123456, 251028, xxxxx, xxxxxx),(....)
> in my opinion,  the record with the id=123456 existed, so  session1 add the X lock on this record,  session2 add the insert intention lock between(123440, 123456), these two lock are not conflicted.  session2 will not be blocked by session1.
> unfortunately, in my case,  session2 is blocked by session1, what happens?
> when session2 is blocked,  I do "select * from information_schema.INNODB_LOCKS" use another session, It shows :
> mysql> select * from information_schema.INNODB_LOCKS;
> +----------------+-------------+-----------+-----------+------------------+------------+------------+-----------+----------+------------------------+
> | lock_id        | lock_trx_id | lock_mode | lock_type | lock_table       | lock_index | lock_space | lock_page | lock_rec | lock_data              |
> +----------------+-------------+-----------+-----------+------------------+------------+------------+-----------+----------+------------------------+
> | 2388:29:2772:1 | 2388        | X         | RECORD    | `mydb`.`sbtest1` | PRIMARY    |         29 |      2772 |        1 | supremum pseudo-record |
> | 2387:29:2772:1 | 2387        | X         | RECORD    | `mydb`.`sbtest1` | PRIMARY    |         29 |      2772 |        1 | supremum pseudo-record |
> +----------------+-------------+-----------+-----------+------------------+------------+------------+-----------+----------+------------------------+
> the two sessions are locked on "supremum pseudo-record" in the same time.

The INNODB_LOCKS does not appear to display the type of the record
lock in sufficient detail. I do not think that there ever can exist a
lock on a page supremum record itself. What can exist is a gap lock
that prevents any INSERT after the last record on the page. The
LOCK_MODE should really distinguish "record" and "record+gap" and
"gap-only" locks. I see that this information comes from
lock_get_mode_str(). It can return "X" or "X,GAP" but not "GAP" which
is a possible mode. Either way, the "X" output seems to indicate that
the lock on the page supremum is not a gap lock, but really locking
the pseudo-record.

I think that in your case, session1 only needs to one record lock,
because the WHERE condition is unique for the PRIMARY KEY. If the
record is found, then that single record should be locked (exclusively
because of the FOR UPDATE). Else, it should lock the gap that prevents
the record from being inserted. Say, if the surrounding records are
id=123440 and id=123460, then the gap lock should be attached to the
id=123460 record, and it would cover all keys between these records.
If id=123440 was the last record on the page, then the gap lock should
be acquired on the page supremum.

Side note 1: INNODB_LOCKS might only show conflicting locks.
Side note 2: INSERT normally uses 'implicit locks', that is, no lock_t
object is created. Only if some other transaction is trying to lock a
record while the INSERT transaction is still active, then that
transaction would convert the implicit lock to an explicit one on
behalf of the thread that ran the INSERT, and then create a wait lock
request object. Only after this point, INNODB_LOCKS would list the

> then I have debugged this instance(row0sel.cc::row_search_mvcc), I got that: when a record existed on the page for the first one, if you want to add the X Lock on it, the supremum of pre-page will be add X Lock(maybe next key lock, supremum can be add gap lock only) also.
> In my case, the records' primary key stored like this:
> (...,  123440, supremum), (infimum, 123456, ...), the id=123456 is the first record of that page, when add X Lock,  first, find the pre-record id=123440, and then add the X lock after this record until match the record witch id=123456, so the supremum is locked also..., because of this lock(supremum), session2 is blocked.  if id=123456 is not the first record of page,   session2 will not be blocked.

OK, this sounds indeed like a bug. Can you please file it at
https://jira.mariadb.org, including a test case and possible code
patch that repeats it?

Side note 3: If it was SELECT * FROM sbtest1 WHERE id=123441 FOR
UPDATE, then we should only lock the gap on the page supremum. This
page supremum would prevent inserts of keys between 123441 and 123456
(the first user record on the next page).

> so my doubt is:    in my case,  the session1 lock the supremum is necessary? if it's necessary,  the lock on supremum maybe resulting the difference even if two mysql instances have the same records,  just as in a replication.

The lock on the page supremum does not appear to be necessary in your
case. And I would like to get a test, so that I can fix it in MariaDB.

> the record located on the page not all the same even if two instances have the same table sturcture & data(if slave define the table compressed, or innodb page_size not set as the same as master, etc..).  if lock on supremum is neccessary,  two instances execute the same SQLs, master is OK, but slave conflicted? on master, the two transactions may be committed in one group,  but on slave maybe conflicted,  slave's parallell replication is OK?

Yes, when it comes to logical replication, I am sure that there are
many examples of this. Because rolled-back transactions would not be
replicated and because InnoDB modifies index trees in a
nondeterministic 'purge' task in the background, the page splits will
necessarily always be different between the master and slaves. Also,
if a gap lock is attached to a purgeable delete-marked record, purge
would widen the gap. If this widening happens on the slave but not on
the master, the transaction would see different locking conflicts.

> also in MySQL 5.7.x,  the xa prepare recorded in binlog,  if on the master, execute two xa transactions:
> xa start 'aaa'; delete from table_name where id=xxxx; xa end 'aaa';xa prepare 'aaa';
> xa start 'bbb';insert into table_name values(xxxx); end 'bbb'; xa prepare 'bbb';
> xa commit 'aaa'; xa commit 'bbb'
> on the master, these xa not conflicted,  I can committed these transactions,  but  on the slave,  maybe conflicted sometimes,  the replication is broken.

I just discussed this with my colleagues yesterday. MariaDB is yet to
fix some XA bugs, such as the famous
Bug #12161 Xa recovery and client disconnection

which Andrei Elkin fixed together with me in MySQL 5.7. Now we both
are MariaDB employees.

Andrei argued that this can be avoided by not replicating XA
transactions until XA COMMIT.
I think that this is a perfect solution for single-master replication.
For multi-master replication, the locks of XA PREPARE transactions
must be taken into account somehow.

Again, please file a bug with the exact scripts or steps to reproduce,
so that we can fix this in MariaDB.

Best regards,

Marko Mäkelä, Lead Developer InnoDB
MariaDB Corporation



MariaDB User Conference

February 26 - 27, 2018

New York City