kernel-packages team mailing list archive
Message #131389
[Bug 1486146] [NEW] recvfrom SYSCALL infinite loop/deadlock chewing 100% CPU (MSG_PEEK|MSG_WAITALL)
Public bug reported:
In a multi-threaded pthreads process running on Ubuntu 14.04 AMD64 (with
over 1000 threads) which uses real time FIFO scheduling, we occasionally
see calls to recv() with flags (MSG_PEEK | MSG_WAITALL) get stuck in an
infinite loop or deadlock, meaning the threads lock up, chewing as much CPU
as they can (due to FIFO scheduling) while stuck inside recv().
Here's an example gdb back trace:
[Switching to thread 4 (Thread 0x7f6040546700 (LWP 27251))]
#0 0x00007f6231d2f7eb in __libc_recv (fd=fd@entry=146, buf=buf@entry=0x7f6040543600, n=n@entry=5, flags=-1, flags@entry=258) at ../sysdeps/unix/sysv/linux/x86_64/recv.c:33
33 ../sysdeps/unix/sysv/linux/x86_64/recv.c: No such file or directory.
(gdb) bt
#0 0x00007f6231d2f7eb in __libc_recv (fd=fd@entry=146, buf=buf@entry=0x7f6040543600, n=n@entry=5, flags=-1, flags@entry=258) at ../sysdeps/unix/sysv/linux/x86_64/recv.c:33
#1 0x0000000000421945 in recv (__flags=258, __n=5, __buf=0x7f6040543600, __fd=146) at /usr/include/x86_64-linux-gnu/bits/socket2.h:44
[snip]
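For readers decoding the trace: flags@entry=258 is 0x102, which on Linux matches MSG_PEEK (0x2) combined with MSG_WAITALL (0x100). The following is only a small sketch that checks this; the helper name is_peek_waitall is hypothetical and is not part of the reporter's code.

```c
#include <stdbool.h>
#include <sys/socket.h>

/* Hypothetical helper (not from the bug report): returns true when both
 * MSG_PEEK and MSG_WAITALL are set in a recv() flags value. */
bool is_peek_waitall(int flags)
{
    const int wanted = MSG_PEEK | MSG_WAITALL;
    return (flags & wanted) == wanted;
}
```

Calling is_peek_waitall(258) on Linux should confirm that the value gdb shows is the flag combination described in the report.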
The socket is a TCP socket in blocking mode. The recv() call is inside
an outer loop with a counter; I've checked the counter with gdb and
it's always at 1, so I'm sure that the outer loop isn't the problem:
the thread is indeed deadlocked inside the recv() internals.
Other notes:
* There always seem to be 2 or more threads deadlocked in the same place (same recv() call but with distinct FDs)
* The threads calling recv() have cancellation disabled by previously executing: pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, NULL);
I've even tried adding a poll() call for POLLRDNORM on the socket before
calling recv() with MSG_PEEK | MSG_WAITALL flags to try to make sure
there's data available on the socket before calling recv(), but it
makes no difference.
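The call pattern described above (poll() for POLLRDNORM, then a peeking recv() that waits for the full length) might look roughly like the sketch below. This is an illustration under assumptions, not the reporter's actual code: the function name peek_exact and the timeout_ms parameter are hypothetical, and the real program's loop and error handling are not shown in the report.

```c
#define _POSIX_C_SOURCE 200809L
#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper (name and timeout parameter are assumptions).
 * Waits until the socket is readable, then peeks at n bytes without
 * consuming them. Returns the recv() result, 0 if poll() timed out,
 * or -1 if poll() failed. */
ssize_t peek_exact(int fd, void *buf, size_t n, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLRDNORM };

    int ready = poll(&pfd, 1, timeout_ms);
    if (ready <= 0)
        return ready; /* 0 on timeout, -1 on error (errno set by poll) */

    /* MSG_PEEK leaves the data queued; MSG_WAITALL asks the kernel to
     * block until n bytes are available (or an error/EOF/signal occurs). */
    return recv(fd, buf, n, MSG_PEEK | MSG_WAITALL);
}
```

Note that poll() only reports that some data is readable, not that n bytes are queued, so the recv() call can still block waiting for the remainder. That may be why adding poll() made no difference here.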
So, I don't know what is wrong here. I've read all the recv()
documentation and believe that recv() is being used correctly; the only
conclusion I can come to is that there is a bug in libc recv() when
using flags MSG_PEEK | MSG_WAITALL with thousands of pthreads running.
** Affects: linux
Importance: Unknown
Status: Unknown
** Affects: linux (Ubuntu)
Importance: High
Status: Triaged
** Tags: kernel-da-key trusty
** Changed in: linux (Ubuntu)
Importance: Undecided => High
** Changed in: linux (Ubuntu)
Status: New => Triaged
** Bug watch added: Linux Kernel Bug Tracker #99461
http://bugzilla.kernel.org/show_bug.cgi?id=99461
** Also affects: linux via
http://bugzilla.kernel.org/show_bug.cgi?id=99461
Importance: Unknown
Status: Unknown
--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1486146
Title:
recvfrom SYSCALL infinite loop/deadlock chewing 100% CPU
(MSG_PEEK|MSG_WAITALL)
Status in Linux:
Unknown
Status in linux package in Ubuntu:
Triaged
To manage notifications about this bug go to:
https://bugs.launchpad.net/linux/+bug/1486146/+subscriptions
Follow ups
-
[Bug 1486146] Re: recvfrom SYSCALL infinite loop/deadlock chewing 100% CPU (MSG_PEEK|MSG_WAITALL)
From: Andy Whitcroft, 2015-10-02
-
[Bug 1486146] Re: recvfrom SYSCALL infinite loop/deadlock chewing 100% CPU (MSG_PEEK|MSG_WAITALL)
From: Launchpad Bug Tracker, 2015-09-28
-
[Bug 1486146] Re: recvfrom SYSCALL infinite loop/deadlock chewing 100% CPU (MSG_PEEK|MSG_WAITALL)
From: Launchpad Bug Tracker, 2015-09-28
-
[Bug 1486146] Re: recvfrom SYSCALL infinite loop/deadlock chewing 100% CPU (MSG_PEEK|MSG_WAITALL)
From: Launchpad Bug Tracker, 2015-09-28
-
[Bug 1486146] Re: recvfrom SYSCALL infinite loop/deadlock chewing 100% CPU (MSG_PEEK|MSG_WAITALL)
From: Joseph Salisbury, 2015-09-25
-
[Bug 1486146] Re: recvfrom SYSCALL infinite loop/deadlock chewing 100% CPU (MSG_PEEK|MSG_WAITALL)
From: Joseph Salisbury, 2015-09-25
-
[Bug 1486146] Re: recvfrom SYSCALL infinite loop/deadlock chewing 100% CPU (MSG_PEEK|MSG_WAITALL)
From: Joseph Salisbury, 2015-09-25
-
[Bug 1486146] Re: recvfrom SYSCALL infinite loop/deadlock chewing 100% CPU (MSG_PEEK|MSG_WAITALL)
From: Mathew Hodson, 2015-09-21
-
[Bug 1486146] Re: recvfrom SYSCALL infinite loop/deadlock chewing 100% CPU (MSG_PEEK|MSG_WAITALL)
From: Joseph Salisbury, 2015-09-21
-
[Bug 1486146] Re: recvfrom SYSCALL infinite loop/deadlock chewing 100% CPU (MSG_PEEK|MSG_WAITALL)
From: Brad Figg, 2015-09-13
-
[Bug 1486146] Re: recvfrom SYSCALL infinite loop/deadlock chewing 100% CPU (MSG_PEEK|MSG_WAITALL)
From: Brad Figg, 2015-09-13
-
[Bug 1486146] Re: recvfrom SYSCALL infinite loop/deadlock chewing 100% CPU (MSG_PEEK|MSG_WAITALL)
From: Launchpad Bug Tracker, 2015-09-11
-
[Bug 1486146] Re: recvfrom SYSCALL infinite loop/deadlock chewing 100% CPU (MSG_PEEK|MSG_WAITALL)
From: Joseph Salisbury, 2015-08-31
-
[Bug 1486146] Re: recvfrom SYSCALL infinite loop/deadlock chewing 100% CPU (MSG_PEEK|MSG_WAITALL)
From: Andy Whitcroft, 2015-08-26
-
[Bug 1486146] Re: recvfrom SYSCALL infinite loop/deadlock chewing 100% CPU (MSG_PEEK|MSG_WAITALL)
From: Brad Figg, 2015-08-25
-
[Bug 1486146] Re: recvfrom SYSCALL infinite loop/deadlock chewing 100% CPU (MSG_PEEK|MSG_WAITALL)
From: Joseph Salisbury, 2015-08-19
-
[Bug 1486146] Re: recvfrom SYSCALL infinite loop/deadlock chewing 100% CPU (MSG_PEEK|MSG_WAITALL)
From: Dan Searle, 2015-08-19
-
[Bug 1486146] Re: recvfrom SYSCALL infinite loop/deadlock chewing 100% CPU (MSG_PEEK|MSG_WAITALL)
From: Joseph Salisbury, 2015-08-19
-
[Bug 1486146] Re: recvfrom SYSCALL infinite loop/deadlock chewing 100% CPU (MSG_PEEK|MSG_WAITALL)
From: Andy Whitcroft, 2015-08-18