
percona-discussion team mailing list archive

Re: [Bug 378100] Re: Innodb locks up under large INSERT load

 

On Thu, May 21, 2009 at 9:25 PM, Jeremy Kerr <jk@xxxxxxxxxx> wrote:
> LCM: glad to hear you got the backtrace. Here's what I can tell from it:
>
> Most threads are blocked in pthread_cond_wait:
>
> 7 are blocked on log_sys->mutex from mtr_commit()
> 4 threads are each waiting on the elements of os_aio_segment_wait_events[0, 1, 2, 3]->os_mutex
> 1 thread is waiting on buf_pool_mutex from log_close(), but *also* holds the log_sys mutex which the first 7 threads are waiting on.
>
> I'm not aware of any locking dependencies that mean that the threads in
> mtr_commit will hold the os_aio_segment_wait_events->os_mutex, but this
> is entirely possible. Vadim: are you able to confirm/reject this?

The os_aio mutexes should be local to os0file.c. There are indirect
dependencies, but I don't see them here. The indirect dependency is
when a thread gets stuck waiting for IO to complete. Per-page locks
are unlocked by the IO thread when IO requests are completed and other
threads can wait on that. My servers don't always have reliable disks,
so a flaky disk that causes many retries can lock things up for me.
But that doesn't appear to be the case here.
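
To make that indirect dependency concrete, here is a rough sketch in Python. It is not InnoDB code: Page, io_thread and wait_for_page are made-up names, and a threading.Event stands in for the per-page lock that only the IO thread releases. It assumes nothing about os0file.c beyond what is described above.

```python
import threading

class Page:
    """Hypothetical page; io_done stands in for the per-page lock held during IO."""
    def __init__(self):
        self.io_done = threading.Event()

def io_thread(page, retries, log):
    # A flaky disk is modelled as a number of retries before the request completes.
    for attempt in range(retries):
        log.append(f"retry {attempt + 1}")
    # Only the IO thread releases the page, and only once the request is complete.
    page.io_done.set()

def wait_for_page(page, timeout):
    # Any other thread that needs the page blocks here until the IO thread is done.
    # Returns False if the timeout expires first.
    return page.io_done.wait(timeout)

page = Page()
log = []
stuck = not wait_for_page(page, 0.01)   # IO has not completed, so the waiter times out
worker = threading.Thread(target=io_thread, args=(page, 3, log))
worker.start()
worker.join()
released = wait_for_page(page, 0.01)    # IO completed, so the waiter gets through
print(stuck, released, log)
```

The waiter never touches an os_aio mutex itself. It is held up only because the IO thread has not finished, which is why this kind of stall does not show up as a direct lock dependency in a backtrace.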

>
> So, the buf_pool_mutex thread could be the culprit here. I'd suggest
> taking a look at the log_sys->mutex and confirming that this thread is
> indeed the owner (just dump the pthread_mutex_t in gdb), and then seeing
> who owns the buf_pool_mutex.
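
The wait chain described in the quoted summary can be written down as a small wait-for table and walked mechanically. This is only a sketch: the thread labels T1..T8 are invented for illustration (the real thread ids are not in this message), the four AIO threads are left out on the assumption that they are not part of the chain, and the owner of buf_pool_mutex is still unknown.

```python
# Wait-for edges taken from the backtrace summary quoted above.
# T1..T7: blocked on log_sys->mutex from mtr_commit().
# T8: blocked on buf_pool_mutex from log_close(), and reported to hold log_sys->mutex.
waits_on = {f"T{i}": "log_sys->mutex" for i in range(1, 8)}
waits_on["T8"] = "buf_pool_mutex"
owner = {"log_sys->mutex": "T8"}  # owner of buf_pool_mutex is not known yet

def root_blocker(thread, waits_on, owner):
    """Follow thread -> mutex -> owner until the chain ends.

    Returns (thread, mutex) where mutex is the lock the last thread in the
    chain is waiting on, None if that thread is not waiting at all, or
    "cycle" if the chain loops back on itself (a deadlock).
    """
    seen = set()
    while thread not in seen:
        seen.add(thread)
        mutex = waits_on.get(thread)
        if mutex is None:
            return thread, None
        holder = owner.get(mutex)
        if holder is None:
            return thread, mutex
        thread = holder
    return thread, "cycle"

print(root_blocker("T1", waits_on, owner))
```

Starting from any of the mtr_commit threads, the walk ends at the thread holding log_sys->mutex and at buf_pool_mutex, which is the mutex whose owner still needs to be identified in gdb.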

LCM: can you print the contents of the buffer pool mutex? Its address
is the 'mutex=...' value in the frame for the thread blocked on the
buf0buf.ic mutex, and it should have type mutex_t. The frame below is
from the thread that is blocked on it.

#3  0x00000000104aa930 in mutex_spin_wait (mutex=0x1089ed60,
file_name=0x106439e8 "../../storage/innobase/include/buf0buf.ic",
line=103) at sync/sync0sync.c:594
	index = 0
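
If there are many frames to go through, the address can be pulled out of the frame text and turned into a gdb print command with a few lines of Python. This is a sketch that assumes the frame is formatted exactly as above; mutex_address and gdb_print_command are hypothetical helper names, and the cast relies on the mutex having type mutex_t as noted above.

```python
import re

# The frame quoted above, joined as it appears in the backtrace.
FRAME = (
    '#3  0x00000000104aa930 in mutex_spin_wait (mutex=0x1089ed60,\n'
    'file_name=0x106439e8 "../../storage/innobase/include/buf0buf.ic",\n'
    'line=103) at sync/sync0sync.c:594'
)

def mutex_address(frame):
    """Return the mutex= argument of a mutex_spin_wait frame, or None if absent."""
    match = re.search(r"mutex_spin_wait \(mutex=(0x[0-9a-fA-F]+)", frame)
    return match.group(1) if match else None

def gdb_print_command(addr):
    # Dereference the address as a mutex_t so gdb prints the struct fields.
    return f"print *(mutex_t *) {addr}"

addr = mutex_address(FRAME)
print(gdb_print_command(addr))  # prints: print *(mutex_t *) 0x1089ed60
```

The resulting line can be pasted at the gdb prompt while attached to the hung mysqld.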

>
> LCM: also, could you post your config.h generated during the mysql
> build? I'd just like to confirm which type of mutexes are in use.

It would be good to know whether this uses gcc atomics. Note that
log_sys->mutex and the buffer pool mutex are mutex_t, not the innodb
rw-mutex. The code for mutex_t is much simpler than for the rw-mutex.
Alas, it doesn't carry as much debugging information by default
(such as the file name and line number where it was last locked). That
is enabled by UNIV_SYNC_DEBUG.

The v3 google patch displays whether gcc atomics are used by Innodb in
either SHOW STATUS or SHOW VARIABLES. Does the Percona build have that
feature?

>
> --
> Innodb locks up under large INSERT load
> https://bugs.launchpad.net/bugs/378100
> You received this bug notification because you are a member of Percona
> developers, which is the registrant for Percona-XtraDB.
>
> Status in Percona XtraDB Storage Engine for MySQL: New
>
> Bug description:
> While trying to load a new db on the 5.1.34-xtradb build, the Innodb engine locks up after a few minutes with multiple table loads running.
>
> I can still log into the server, but I don't get any data back from the SHOW ENGINE INNODB STATUS command. I also can't kill any of the existing INSERT commands. I can reproduce the problem with as few as 3 simultaneous loads (INSERT) occurring.
>
> I am running this on SuSE version 10 with 8 processors and 126GB of RAM.
>
> Can I get some help in troubleshooting this? There are no messages being written to the error log.
>
> _______________________________________________
> Mailing list: https://launchpad.net/~percona-discussion
> Post to     : percona-discussion@xxxxxxxxxxxxxxxxxxx
> Unsubscribe : https://launchpad.net/~percona-discussion
> More help   : https://help.launchpad.net/ListHelp
>



-- 
Mark Callaghan
mdcallag@xxxxxxxxx


