
kernel-packages team mailing list archive

[Bug 1394032] Re: Trusty isci module doesn't handle timeouts properly

 

After a night of stress testing with the 3.18-rc5 packages I have not
been able to reproduce an unhandled SAS/SCSI event (the new kernel did
intensely dislike NTP and AppArmor, but that's another issue
altogether).

There were only two 'sas_scsi_recover_host' events early in the boot
process: one while initializing the controller and one after what
looks like a PCI bus scan (I've attached the full dmesg output).

So far it seems that the new libsas and isci driver/firmware are much
better at handling the Intel C602 in my chassis.  Is it possible to
backport libsas and isci from mainline to trusty?
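
For anyone wanting to repeat the count on their own hosts, here is a
minimal sketch of how the 'sas_scsi_recover_host' entries could be
tallied from a saved dmesg file. It assumes the log lines have the same
shape as the 3.13 lines quoted in the bug description; the function name
and the sample text are only illustrative.

```python
import re

# Matches libsas error-handler entry lines such as:
#   sas: Enter sas_scsi_recover_host busy: 4 failed: 4
# (format assumed from the dmesg excerpt in the bug description)
ENTER_RE = re.compile(
    r"sas: Enter sas_scsi_recover_host busy: (\d+) failed: (\d+)"
)


def recover_host_events(dmesg_text):
    """Return one (busy, failed) tuple per recovery entry in the text."""
    return [
        (int(m.group(1)), int(m.group(2)))
        for m in ENTER_RE.finditer(dmesg_text)
    ]


# Illustrative sample taken from the bug description's dmesg output.
sample = """\
[Tue Nov 18 16:56:10 2014] sd 7:0:0:0: [sdg] command ffff8808434fa600 timed out
[Tue Nov 18 16:56:10 2014] sas: Enter sas_scsi_recover_host busy: 4 failed: 4
[Tue Nov 18 16:56:10 2014] sas: --- Exit sas_scsi_recover_host: busy: 0 failed: 0 tries: 1
"""

events = recover_host_events(sample)
print(len(events), events)
```

Only the "Enter" lines are counted, so each recovery pass shows up
once regardless of how many commands it covered.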

** Attachment added: "dmesg-c-3CF89A06CD.txt"
   https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1394032/+attachment/4264391/+files/dmesg-c-3CF89A06CD.txt

** Changed in: linux (Ubuntu)
       Status: Incomplete => Confirmed

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1394032

Title:
  Trusty isci module doesn't handle timeouts properly

Status in “linux” package in Ubuntu:
  Confirmed

Bug description:
  I'm currently running linux 3.13.0-39 on trusty with disks plugged
  into an Intel C602 SATA/SAS controller.  Occasionally, a timeout
  and/or SAS event (I'm not 100% sure which) isn't handled properly
  ('Unhandled error code') and the kernel gets a bit upset.

  I have 12 different hosts with this controller and disk combination
  and all display the same behaviour (dmesg output):

  [Tue Nov 18 16:56:10 2014] sd 7:0:0:0: [sdg] command ffff8808434fa600 timed out
  [Tue Nov 18 16:56:10 2014] sd 7:0:0:0: [sdg] command ffff880843673d00 timed out
  [Tue Nov 18 16:56:10 2014] sd 7:0:0:0: [sdg] command ffff88105081bc00 timed out
  [Tue Nov 18 16:56:10 2014] sd 7:0:0:0: [sdg] command ffff88084378e100 timed out
  [Tue Nov 18 16:56:10 2014] sas: Enter sas_scsi_recover_host busy: 4 failed: 4
  [Tue Nov 18 16:56:10 2014] sas: ata7: end_device-7:0: cmd error handler
  [Tue Nov 18 16:56:10 2014] sas: ata7: end_device-7:0: dev error handler
  [Tue Nov 18 16:56:10 2014] sd 7:0:0:0: [sdg] Unhandled error code
  [Tue Nov 18 16:56:10 2014] sd 7:0:0:0: [sdg]
  [Tue Nov 18 16:56:10 2014] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
  [Tue Nov 18 16:56:10 2014] sd 7:0:0:0: [sdg] CDB:
  [Tue Nov 18 16:56:10 2014] Write(10): 2a 00 04 9e 77 60 00 00 08 00
  [Tue Nov 18 16:56:10 2014] end_request: I/O error, dev sdg, sector 77494112
  [Tue Nov 18 16:56:10 2014] EXT4-fs warning (device dm-2): ext4_end_bio:317: I/O error -5 writing to inode 261733 (offset 0 size 0 starting block 5061868)
  [Tue Nov 18 16:56:10 2014] Buffer I/O error on device dm-2, logical block 5061868
  [Tue Nov 18 16:56:10 2014] sd 7:0:0:0: [sdg] Unhandled error code
  [Tue Nov 18 16:56:10 2014] sd 7:0:0:0: [sdg]
  [Tue Nov 18 16:56:10 2014] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
  [Tue Nov 18 16:56:10 2014] sd 7:0:0:0: [sdg] CDB:
  [Tue Nov 18 16:56:10 2014] Write(10): 2a 00 04 0f d0 e0 00 00 08 00
  [Tue Nov 18 16:56:10 2014] end_request: I/O error, dev sdg, sector 68145376
  [Tue Nov 18 16:56:10 2014] EXT4-fs warning (device dm-2): ext4_end_bio:317: I/O error -5 writing to inode 261710 (offset 0 size 0 starting block 3893276)
  [Tue Nov 18 16:56:10 2014] Buffer I/O error on device dm-2, logical block 3893276
  [Tue Nov 18 16:56:10 2014] sd 7:0:0:0: [sdg] Unhandled error code
  [Tue Nov 18 16:56:10 2014] sd 7:0:0:0: [sdg]
  [Tue Nov 18 16:56:10 2014] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
  [Tue Nov 18 16:56:10 2014] sd 7:0:0:0: [sdg] CDB:
  [Tue Nov 18 16:56:10 2014] Write(10): 2a 00 02 b8 a1 f8 00 00 08 00
  [Tue Nov 18 16:56:10 2014] end_request: I/O error, dev sdg, sector 45654520
  [Tue Nov 18 16:56:10 2014] Buffer I/O error on device dm-2, logical block 1081919
  [Tue Nov 18 16:56:10 2014] lost page write due to I/O error on dm-2
  [Tue Nov 18 16:56:10 2014] sd 7:0:0:0: [sdg] Unhandled error code
  [Tue Nov 18 16:56:10 2014] sd 7:0:0:0: [sdg]
  [Tue Nov 18 16:56:10 2014] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
  [Tue Nov 18 16:56:10 2014] sd 7:0:0:0: [sdg] CDB:
  [Tue Nov 18 16:56:10 2014] Write(10): 2a 00 02 b8 a1 58 00 00 08 00
  [Tue Nov 18 16:56:10 2014] end_request: I/O error, dev sdg, sector 45654360
  [Tue Nov 18 16:56:10 2014] Buffer I/O error on device dm-2, logical block 1081899
  [Tue Nov 18 16:56:10 2014] lost page write due to I/O error on dm-2
  [Tue Nov 18 16:56:10 2014] sas: --- Exit sas_scsi_recover_host: busy: 0 failed: 0 tries: 1
  --- 
  AlsaDevices:
   total 0
   crw-rw---- 1 root audio 116,  1 Nov 17 19:42 seq
   crw-rw---- 1 root audio 116, 33 Nov 17 19:42 timer
  AplayDevices: Error: [Errno 2] No such file or directory
  ApportVersion: 2.14.1-0ubuntu3.5
  Architecture: amd64
  ArecordDevices: Error: [Errno 2] No such file or directory
  AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
  CRDA: Error: [Errno 2] No such file or directory
  DistroRelease: Ubuntu 14.04
  HibernationDevice: RESUME=/dev/mapper/root-swap
  IwConfig: Error: [Errno 2] No such file or directory
  Lsusb:
   Bus 002 Device 002: ID 8087:0024 Intel Corp. Integrated Rate Matching Hub
   Bus 002 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
   Bus 001 Device 003: ID 0557:2221 ATEN International Co., Ltd Winbond Hermon
   Bus 001 Device 002: ID 8087:0024 Intel Corp. Integrated Rate Matching Hub
   Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
  MachineType: Supermicro X9DRT-PT
  Package: linux (not installed)
  PciMultimedia:
   
  ProcFB:
   
  ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-3.13.0-39-generic root=/dev/mapper/root-root ro console=tty0 console=ttyS1,115200n8 swapaccount=1 quiet splash vt.handoff=7
  ProcVersionSignature: Ubuntu 3.13.0-39.66-generic 3.13.11.8
  RelatedPackageVersions:
   linux-restricted-modules-3.13.0-39-generic N/A
   linux-backports-modules-3.13.0-39-generic  N/A
   linux-firmware                             1.127.8
  RfKill: Error: [Errno 2] No such file or directory
  Tags:  trusty
  Uname: Linux 3.13.0-39-generic x86_64
  UpgradeStatus: No upgrade log present (probably fresh install)
  UserGroups:
   
  WifiSyslog:
   
  _MarkForUpload: True
  dmi.bios.date: 05/06/2014
  dmi.bios.vendor: American Megatrends Inc.
  dmi.bios.version: 3.0b
  dmi.board.asset.tag: To be filled by O.E.M.
  dmi.board.name: X9DRT-PT
  dmi.board.vendor: Supermicro
  dmi.board.version: 1.01
  dmi.chassis.asset.tag: To Be Filled By O.E.M.
  dmi.chassis.type: 17
  dmi.chassis.vendor: Supermicro
  dmi.chassis.version: 0123456789
  dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvr3.0b:bd05/06/2014:svnSupermicro:pnX9DRT-PT:pvr0123456789:rvnSupermicro:rnX9DRT-PT:rvr1.01:cvnSupermicro:ct17:cvr0123456789:
  dmi.product.name: X9DRT-PT
  dmi.product.version: 0123456789
  dmi.sys.vendor: Supermicro
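
As a cross-check on the dmesg output above, the Write(10) CDBs can be
decoded by hand: bytes 2-5 hold the big-endian logical block address and
bytes 7-8 the transfer length in blocks. The sketch below (the helper
name is hypothetical) decodes the four CDBs from the log; the LBAs it
yields should line up with the sector numbers on the matching
'end_request' lines.

```python
def decode_write10(cdb_hex):
    """Decode a SCSI WRITE(10) CDB given as space-separated hex bytes.

    Returns (lba, blocks). Raises ValueError if the input is not a
    10-byte CDB with opcode 0x2a.
    """
    cdb = bytes.fromhex(cdb_hex.replace(" ", ""))
    if len(cdb) != 10 or cdb[0] != 0x2A:
        raise ValueError("not a WRITE(10) CDB")
    lba = int.from_bytes(cdb[2:6], "big")     # bytes 2-5: logical block address
    blocks = int.from_bytes(cdb[7:9], "big")  # bytes 7-8: transfer length
    return lba, blocks


# The four CDBs reported for sdg in the dmesg excerpt.
cdbs = [
    "2a 00 04 9e 77 60 00 00 08 00",
    "2a 00 04 0f d0 e0 00 00 08 00",
    "2a 00 02 b8 a1 f8 00 00 08 00",
    "2a 00 02 b8 a1 58 00 00 08 00",
]

for cdb in cdbs:
    lba, blocks = decode_write10(cdb)
    print(f"{cdb}: lba={lba} blocks={blocks}")
```

Each command writes 8 blocks, which would be 4 KiB if the logical
block size is 512 bytes (an assumption; the log does not state it).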

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1394032/+subscriptions

