
kernel-packages team mailing list archive

[Bug 1276705] Re: Kernel 3.13 fail to boot with LSI SAS1068E (Dell SAS 6/iR)

 

Pretty sure that's it. -12 is -ENOMEM, and there are two sites in that
commit that return -ENOMEM, right when we have an error message from the
failure to spawn the SCSI error handler thread.
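As a small illustration of how the "error = -12" in the log maps to -ENOMEM: kernel code conventionally reports failure as a negated errno value. The sketch below is plain userspace C, and the helper names (describe_kernel_error, is_enomem) are made up for this example; it assumes a Linux errno.h, where ENOMEM is 12.

```c
#include <errno.h>
#include <string.h>

/*
 * Hypothetical helper (not a kernel API): turn a kernel-style return
 * value, i.e. 0 on success or a negated errno on failure, into text.
 */
const char *describe_kernel_error(int ret)
{
    if (ret >= 0)
        return "success";
    /* Negate to recover the positive errno value before the lookup. */
    return strerror(-ret);
}

/* Hypothetical helper: true if the return value means "out of memory". */
int is_enomem(int ret)
{
    return ret == -ENOMEM;
}
```

Calling is_enomem(-12) on a Linux system should report true, matching the value printed by the error handler message.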

Note that the oops/backtrace is a red herring. There is a secondary,
unrelated bug in the mptsas code that is triggered by this untested
codepath: when scsi_host_alloc() fails (so ioc->sh is never set),
"Unable to register controller with SCSI subsystem" is printed (which
we see), then it jumps to out_mptsas_probe, where mptscsih_remove() is
called. However, mptscsih_remove() tries to scsi_remove_host(host)
with host = ioc->sh, and ioc->sh is NULL, since that is the very reason
we ended up here. The solution for this (again, unrelated) bug would be
to have a different label for early failures that won't call
mptscsih_remove(). I'll prepare a patch for this and submit it to
linux-scsi shortly.
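The shape of that fix can be sketched in userspace C. This is only an illustration of the "separate label for early failures" idea, not the actual mptsas code or the patch: struct host, struct adapter, host_alloc(), full_remove(), probe(), the label names and the -1 return code are all hypothetical stand-ins.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the SCSI host and the adapter (ioc). */
struct host { int registered; };
struct adapter { struct host *sh; int remove_calls; };

/* Stand-in for the host allocation; the flag simulates a failure. */
struct host *host_alloc(int simulate_failure)
{
    if (simulate_failure)
        return NULL;
    return calloc(1, sizeof(struct host));
}

/*
 * Stand-in for the full teardown path. Like the function described
 * above, it assumes ioc->sh is valid and would crash on NULL.
 */
void full_remove(struct adapter *ioc)
{
    ioc->remove_calls++;
    ioc->sh->registered = 0;
    free(ioc->sh);
    ioc->sh = NULL;
}

int probe(struct adapter *ioc, int fail_alloc, int fail_later)
{
    int error = 0;

    ioc->sh = host_alloc(fail_alloc);
    if (!ioc->sh) {
        error = -1;          /* arbitrary failure code for the sketch */
        goto out_early;      /* nothing to tear down yet */
    }
    ioc->sh->registered = 1;

    if (fail_later) {
        error = -1;
        goto out_remove;     /* host exists, full teardown is safe */
    }
    return 0;

out_remove:
    full_remove(ioc);
out_early:
    return error;
}
```

With a single label, the allocation failure would fall into full_remove() and dereference a NULL ioc->sh; with the early label it simply returns the error.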

The real issue of this bug is why the kthread spawning fails; I haven't
figured that out yet.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1276705

Title:
  Kernel 3.13 fail to boot with LSI SAS1068E (Dell SAS 6/iR)

Status in “linux” package in Ubuntu:
  Confirmed

Bug description:
  We have recently upgraded a Dell R300 server to Trusty (it was running
  fine on Precise), and after the upgrade it fails to boot.

  The issue is with the SAS controller during initialisation. It fails
  to detect the disks; we get the following errors in the console log:

  [   36.539955] scsi4: error handler thread failed to spawn, error = -12
  [   36.552694] mptsas: ioc0: WARNING - Unable to register controller with SCSI subsystem

  After this error, the initramfs drops to a shell complaining that the
  rootfs is not found. No disk is seen at all (cat /proc/partitions only
  shows sr0, the CD-ROM drive).

  We have this issue with two different servers (both R300, both with
  the Dell SAS 6/iR controller and the same hardware).

  We don't have this issue with another Dell server (R310, Dell PERC
  H200).

  We also tested with an old kernel (generic, 3.2.0-58.88), and it works.

  Those servers need a greater rootdelay (probably #579572), so we have
  rootdelay=45. If we remove rootdelay=45, the disks are correctly
  recognized! (But a few seconds too late: the initramfs has already
  dropped to a shell. Pressing Control-D resumes the normal boot.)

  So the issue is that with the (mandatory) rootdelay greater than 30
  (the default value, I think), the disks are not detected due to the
  error shown above. This is a regression, since those servers worked on
  Precise (and work with the old Precise kernel).

  
  System information

  * Dell R300 with Dell SAS 6/iR controller
  * Ubuntu Trusty Tahr (14.04)
  * Running arch: x86_64
  * Kernel version: 3.13.0-7-generic  (dpkg version : 3.13.0-7.25)
  * Kernel command line: BOOT_IMAGE=/vmlinuz-3.13.0-7-generic root=UUID=174e14b5-46fc-479b-9f94-05cb33c75ac9 ro rootdelay=45 console=tty0 console=ttyS1,57600 quiet
  * uname -a: Linux frtls-perf01 3.13.0-7-generic #25-Ubuntu SMP Tue Feb 4 10:19:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

  
  Attached files:

  * console output when the error occurs.
  * dmesg when the system boots (no rootdelay; need to press Control-D during initramfs boot)
  * lspci -vnn

  
  Tell me if you need more information.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1276705/+subscriptions

