kernel-packages team mailing list archive

[Bug 1598285] Missing required logs.

To: kernel-packages@xxxxxxxxxxxxxxxxxxx
From: Brad Figg <brad.figg@xxxxxxxxxxxxx>
Date: Fri, 01 Jul 2016 20:00:05 -0000
Reply-to: Bug 1598285 <1598285@xxxxxxxxxxxxxxxxxx>
Sender: bounces@xxxxxxxxxxxxx

This bug is missing log files that will aid in diagnosing the problem.
From a terminal window please run:

apport-collect 1598285

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable
to run this command, please add a comment stating that fact and change
the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the
Ubuntu Kernel Team.

** Changed in: linux (Ubuntu)
       Status: New => Incomplete

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1598285

Title:
  possible deadlock while using the cgroup freezer on a container with
  NFS-based workload

Status in linux package in Ubuntu:
  Incomplete

Bug description:
  Hi guys,

  For background: I'm running a container with an NFS filesystem bind
  mounted into it. The workload I'm running is iozone, a filesystem
  benchmarking tool. While running this workload, I attempt to freeze
  the container, which gets stuck in the FREEZING state. After a while,
  I get:

  Jul  1 01:45:14 juju-19f8e3-15 kernel: [206520.104156] INFO: task iozone:20035 blocked for more than 120 seconds.
  Jul  1 01:45:14 juju-19f8e3-15 kernel: [206520.111056]       Tainted: P           O    4.4.0-24-generic #43-Ubuntu
  Jul  1 01:45:14 juju-19f8e3-15 kernel: [206520.118053] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
  Jul  1 01:45:14 juju-19f8e3-15 kernel: [206520.126110] iozone          D ffff880015673e18     0 20035  20005 0x00000104
  Jul  1 01:45:14 juju-19f8e3-15 kernel: [206520.126116]  ffff880015673e18 ffff880000000010 ffff880045a21b80 ffff880037776e00
  Jul  1 01:45:14 juju-19f8e3-15 kernel: [206520.126118]  ffff880015674000 ffff8800179d6e54 ffff880037776e00 00000000ffffffff
  Jul  1 01:45:14 juju-19f8e3-15 kernel: [206520.126120]  ffff8800179d6e58 ffff880015673e30 ffffffff81821b15 ffff8800179d6e50
  Jul  1 01:45:14 juju-19f8e3-15 kernel: [206520.126121] Call Trace:
  Jul  1 01:45:14 juju-19f8e3-15 kernel: [206520.126129]  [<ffffffff81821b15>] schedule+0x35/0x80
  Jul  1 01:45:14 juju-19f8e3-15 kernel: [206520.126131]  [<ffffffff81821dbe>] schedule_preempt_disabled+0xe/0x10
  Jul  1 01:45:14 juju-19f8e3-15 kernel: [206520.126134]  [<ffffffff818239f9>] __mutex_lock_slowpath+0xb9/0x130
  Jul  1 01:45:14 juju-19f8e3-15 kernel: [206520.126136]  [<ffffffff81823a8f>] mutex_lock+0x1f/0x30
  Jul  1 01:45:14 juju-19f8e3-15 kernel: [206520.126139]  [<ffffffff8121d00b>] do_unlinkat+0x12b/0x2d0
  Jul  1 01:45:14 juju-19f8e3-15 kernel: [206520.126142]  [<ffffffff8121dc16>] SyS_unlink+0x16/0x20
  Jul  1 01:45:14 juju-19f8e3-15 kernel: [206520.126146]  [<ffffffff81825bf2>] entry_SYSCALL_64_fastpath+0x16/0x71
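
The freeze attempt described above can be sketched roughly as follows. This is a minimal illustration, not the reporter's actual tooling: it assumes the cgroup v1 freezer interface, where writing FROZEN to a group's freezer.state file requests a freeze and reading the file back yields THAWED, FREEZING or FROZEN. The helper names (wait_for_frozen, freeze_cgroup) and the timeout values are hypothetical.

```python
import time
from pathlib import Path


def wait_for_frozen(read_state, timeout=10.0, poll_interval=0.1):
    """Poll read_state() until it returns "FROZEN" or the timeout expires.

    Returns the last state seen, so a caller can tell a completed freeze
    ("FROZEN") from one that is still stuck ("FREEZING").
    """
    deadline = time.monotonic() + timeout
    while True:
        state = read_state()
        if state == "FROZEN" or time.monotonic() >= deadline:
            return state
        time.sleep(poll_interval)


def freeze_cgroup(state_file, timeout=10.0, poll_interval=0.1):
    """Request a freeze via a freezer.state file and wait for it to finish.

    state_file is assumed to be the freezer.state file of the container's
    freezer cgroup; where that file lives depends on the container manager.
    """
    path = Path(state_file)
    path.write_text("FROZEN\n")  # ask the kernel to freeze every task in the group
    return wait_for_frozen(
        lambda: path.read_text().strip(),
        timeout=timeout,
        poll_interval=poll_interval,
    )
```

A caller would pass the container's freezer.state path and treat a returned "FREEZING" as the symptom reported here: the freeze was requested, but at least one task never reached the frozen state.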

  It looks like the task is actually stuck in generic fs code (blocked on
  a mutex in do_unlinkat), not anything NFS-specific, but perhaps NFS is
  still a relevant detail. Anyway:

  ubuntu@juju-19f8e3-15:~$ sudo cat /proc/20035/stack
  [<ffffffff8121d00b>] do_unlinkat+0x12b/0x2d0
  [<ffffffff8121dc16>] SyS_unlink+0x16/0x20
  [<ffffffff81825bf2>] entry_SYSCALL_64_fastpath+0x16/0x71
  [<ffffffffffffffff>] 0xffffffffffffffff
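
When comparing stacks like the one above across several stuck tasks, it can help to reduce each frame to its symbol and offset. The following is a small sketch under the assumption that frames use the "[<address>] symbol+0xoffset/0xsize" layout shown in this report; the names FRAME_RE, SAMPLE and parse_stack are made up for illustration.

```python
import re

# Matches frames such as "[<ffffffff8121d00b>] do_unlinkat+0x12b/0x2d0".
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+"
    r"(?P<sym>[\w.]+)\+(?P<off>0x[0-9a-f]+)/(?P<size>0x[0-9a-f]+)"
)

# The /proc/20035/stack output quoted in the report.
SAMPLE = """\
[<ffffffff8121d00b>] do_unlinkat+0x12b/0x2d0
[<ffffffff8121dc16>] SyS_unlink+0x16/0x20
[<ffffffff81825bf2>] entry_SYSCALL_64_fastpath+0x16/0x71
[<ffffffffffffffff>] 0xffffffffffffffff
"""


def parse_stack(text):
    """Return (symbol, offset, size) tuples for every symbolized frame.

    Lines without a "symbol+offset/size" part, such as the trailing
    0xffffffffffffffff marker, are skipped.
    """
    frames = []
    for line in text.splitlines():
        match = FRAME_RE.search(line)
        if match:
            frames.append(
                (match["sym"], int(match["off"], 16), int(match["size"], 16))
            )
    return frames


if __name__ == "__main__":
    for sym, off, size in parse_stack(SAMPLE):
        print(f"{sym} +{off:#x} of {size:#x}")
```

Because the pattern is searched for anywhere in the line, the same function should also work on the syslog-prefixed call trace lines, provided they keep the same frame layout.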

  The container and host are both xenial:

  ubuntu@juju-19f8e3-15:~$ uname -a
  Linux juju-19f8e3-15 4.4.0-24-generic #43-Ubuntu SMP Wed Jun 8 19:27:37 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

  Finally, I don't have a good reproducer for this. It's pretty rare:
  I'm running this benchmark in a loop, and over thousands of runs I've
  seen this exactly once.

  I'll leave these hosts up for a bit in case there are any other
  interesting bits of info to collect.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1598285/+subscriptions

References

[Bug 1598285] [NEW] possible deadlock while using the cgroup freezer on a container with NFS-based workload
From: Tycho Andersen, 2016-07-01