group.of.nepali.translators team mailing list archive

[Bug 1572630] Re: boot-time kernel panic introduced in 4.4.0-18, not present in 4.4.0-15

 

** Changed in: linux-lts-xenial (Ubuntu Xenial)
     Assignee: (unassigned) => Eric Desrochers (slashd)

** Changed in: linux-lts-xenial (Ubuntu Xenial)
       Status: Invalid => In Progress

** Changed in: linux-lts-xenial (Ubuntu Xenial)
   Importance: Undecided => Medium

-- 
You received this bug notification because you are a member of नेपाली
भाषा समायोजकहरुको समूह, which is subscribed to Xenial.
Matching subscriptions: Ubuntu 16.04 Bugs
https://bugs.launchpad.net/bugs/1572630

Title:
  boot-time kernel panic introduced in 4.4.0-18, not present in 4.4.0-15

Status in linux package in Ubuntu:
  Fix Released
Status in linux-lts-xenial package in Ubuntu:
  Invalid
Status in linux source package in Trusty:
  Invalid
Status in linux-lts-xenial source package in Trusty:
  In Progress
Status in linux source package in Xenial:
  Fix Committed
Status in linux-lts-xenial source package in Xenial:
  In Progress
Status in linux source package in Yakkety:
  Fix Released
Status in linux-lts-xenial source package in Yakkety:
  Invalid

Bug description:
  [Impact]

  At boot time, the kernel panics somewhere in
  'blk_mq_register_disk'. A snippet of the trace is below, and the full
  panic dump is attached. The panic dump was collected via serial
  console, as the kernel panics so early that we cannot kdump it.

  [ 2.650512] [<ffffffff813ac8a6>] blk_mq_register_disk+0xa6/0x160
  [ 2.656675] [<ffffffff813a1b44>] blk_register_queue+0xb4/0x160
  [ 2.662661] [<ffffffff813af53e>] add_disk+0x1ce/0x490
  [ 2.667869] [<ffffffff815477e0>] loop_add+0x1f0/0x270
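For anyone comparing their own panic output against these frames, a minimal Python sketch of pulling the symbol, offset, and function size out of each line follows. The regular expression and the names FRAME_RE and parse_frames are illustrative assumptions, not part of any kernel or Launchpad tooling; the sample input is the four frames quoted above.

```python
import re

# Assumed frame layout, taken from the lines quoted in this report:
#   [ 2.650512] [<ffffffff813ac8a6>] blk_mq_register_disk+0xa6/0x160
# i.e. [timestamp] [<address>] symbol+0xoffset/0xsize
FRAME_RE = re.compile(
    r"\[\s*(?P<time>\d+\.\d+)\]\s+"
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+"
    r"(?P<symbol>[\w.]+)\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
)


def parse_frames(text):
    """Return one dict per stack frame found in text; other lines are skipped."""
    frames = []
    for line in text.splitlines():
        match = FRAME_RE.search(line)
        if match is None:
            continue
        frames.append({
            "time": float(match.group("time")),
            "address": int(match.group("addr"), 16),
            "symbol": match.group("symbol"),
            "offset": int(match.group("offset"), 16),
            "size": int(match.group("size"), 16),
        })
    return frames


TRACE = """\
[ 2.650512] [<ffffffff813ac8a6>] blk_mq_register_disk+0xa6/0x160
[ 2.656675] [<ffffffff813a1b44>] blk_register_queue+0xb4/0x160
[ 2.662661] [<ffffffff813af53e>] add_disk+0x1ce/0x490
[ 2.667869] [<ffffffff815477e0>] loop_add+0x1f0/0x270
"""

frames = parse_frames(TRACE)
for frame in frames:
    # Innermost frame is printed first, matching the order in the trace.
    print(f"{frame['symbol']} +{frame['offset']:#x} (function size {frame['size']:#x})")
```

The list of symbols, innermost first, is usually enough to tell whether another panic goes through the same call path as this one.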

  [Test Case]

  At boot time, the kernel panics somewhere in
  'blk_mq_register_disk'. A snippet of the trace is shown under
  [Impact], and the full panic dump is attached.

  [Regression Potential]

   * Fix implemented upstream starting with v4.6-rc1

   * The fix is fairly straightforward given the stack trace.

   * The fix is hard to verify, but user "Proton" was able to confirm
  that upstream mainline 4.6-rc1 solves the problem and that the test
  kernel I provided, which includes the fix, solves this particular
  problem as well.

  Confirmation by Proton:
  https://bugs.launchpad.net/ubuntu/+source/linux-lts-xenial/+bug/1572630/comments/23

  [Other Info]

   * https://lkml.org/lkml/2016/3/16/40
   * http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=e0e827b9
   * http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=897bb0c7

  [Original Description]
  We discovered a pretty serious regression introduced in 4.4.0-18.

  At boot time, the kernel panics somewhere in
  'blk_mq_register_disk'. A snippet of the trace is below, and the full
  panic dump is attached. The panic dump was collected via serial
  console, as the kernel panics so early that we cannot kdump it.

  [    2.650512]  [<ffffffff813ac8a6>] blk_mq_register_disk+0xa6/0x160
  [    2.656675]  [<ffffffff813a1b44>] blk_register_queue+0xb4/0x160
  [    2.662661]  [<ffffffff813af53e>] add_disk+0x1ce/0x490
  [    2.667869]  [<ffffffff815477e0>] loop_add+0x1f0/0x270

  This seems somewhat similar to https://lkml.org/lkml/2016/3/16/40, but
  the trace is not identical.

  We discovered this issue when we were experimenting with
  linux-generic-lts-xenial from trusty-updates on a 14.04 installation.
  When we installed it, 4.4.0-15 was the current package, and it worked
  fine and provided a large number of improvements for us. Background
  security updates installed 4.4.0-18, which updated grub and became
  the default kernel. On a reboot, the node panics about 2 seconds in,
  leaving the machine in a dead state. We were able to boot a rescue
  image and roll back to 4.4.0-15, which works nicely. We currently
  have pinning on 4.4.0-15 to prevent this problem from coming back,
  but would prefer to see the problem fixed.
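As a rough aid for checking a node before rebooting it, the two data points in this report can be encoded as below. This is only a sketch: the names KNOWN_GOOD, KNOWN_BAD, parse_release, and classify are hypothetical, the release string format ("4.4.0-18-generic") is an assumption about what the kernel reports, and versions other than 4.4.0-15 and 4.4.0-18 are deliberately left unclassified because the report says nothing about them.

```python
import re

# The only two data points given in this report (assumption: the ABI
# number after the dash is what distinguishes the two packages).
KNOWN_GOOD = (4, 4, 0, 15)   # reported to boot fine
KNOWN_BAD = (4, 4, 0, 18)    # reported to panic at boot


def parse_release(release):
    """Split a string like '4.4.0-18-generic' into (4, 4, 0, 18)."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)-(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return tuple(int(part) for part in match.groups())


def classify(release):
    """Classify a kernel release using only what this report states."""
    version = parse_release(release)
    if version == KNOWN_GOOD:
        return "reported working"
    if version == KNOWN_BAD:
        return "reported panicking"
    # Anything else is outside what the report covers.
    return "not covered by this report"


for example in ("4.4.0-15-generic", "4.4.0-18-generic", "4.4.0-21-generic"):
    print(example, "->", classify(example))
```

On a live system the input could come from `platform.release()` in the Python standard library, which returns the running kernel's release string.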

  I'll attach lspci, lshw, and dmidecode for our hardware as well, but
  this is happening on pretty vanilla Supermicro nodes. We are able to
  consistently reproduce it on our hardware. It is not reproducible in
  EC2, only on metal.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1572630/+subscriptions