kernel-packages team mailing list archive
Message #31642
[Bug 593635] Re: HDD freezes caused by ata exception that results in soft resetting of link
[Expired for linux (Ubuntu) because there has been no activity for 60
days.]
** Changed in: linux (Ubuntu)
Status: Incomplete => Expired
--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/593635
Title:
HDD freezes caused by ata exception that results in soft resetting of
link
Status in “linux” package in Ubuntu:
Expired
Bug description:
Under even moderately heavy disk writes, I see exceptions like the following in my kern.log:
-----------------------------------------------
Jun 13 13:33:03 cellar kernel: [66188.434868] ata4.01: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6
Jun 13 13:33:03 cellar kernel: [66188.434874] ata4.01: BMDMA stat 0x46
Jun 13 13:33:03 cellar kernel: [66188.434879] ata4.01: failed command: WRITE DMA EXT
Jun 13 13:33:03 cellar kernel: [66188.434886] ata4.01: cmd 35/00:00:00:94:b2/00:04:13:00:00/f0 tag 0 dma 524288 out
Jun 13 13:33:03 cellar kernel: [66188.434888] res 51/84:01:ff:95:b2/84:02:13:00:00/f0 Emask 0x30 (host bus error)
Jun 13 13:33:03 cellar kernel: [66188.434892] ata4.01: status: { DRDY ERR }
Jun 13 13:33:03 cellar kernel: [66188.434895] ata4.01: error: { ICRC ABRT }
Jun 13 13:33:03 cellar kernel: [66188.434907] ata4: soft resetting link
Jun 13 13:33:03 cellar kernel: [66188.622000] ata4.01: configured for UDMA/100
Jun 13 13:33:03 cellar kernel: [66188.622013] ata4: EH complete
----------------------------------------------
This is with the latest stable lucid kernel (2.6.32-22-generic
#36-Ubuntu).
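The "cmd" and "res" lines in the trace above pack the ATA taskfile registers into colon-separated hex bytes. The following is a minimal Python sketch of how they can be decoded. It assumes the field order libata uses when printing these lines (command or status, then feature/error, sector count and LBA bytes, then the high-order bytes, then the device register) and 512-byte sectors. The names parse_taskfile and decode_bits are hypothetical helpers written for this illustration, and the bit tables cover only the flags libata names in its "status:" and "error:" lines.

```python
import re

# Flags libata names when it prints "status: { ... }" and "error: { ... }".
STATUS_BITS = {0x80: "BUSY", 0x40: "DRDY", 0x20: "DF", 0x08: "DRQ", 0x01: "ERR"}
ERROR_BITS = {0x80: "ICRC", 0x40: "UNC", 0x10: "IDNF", 0x04: "ABRT"}

# Matches e.g. "cmd 35/00:00:00:94:b2/00:04:13:00:00/f0".
TASKFILE_RE = re.compile(
    r"(cmd|res) ([0-9a-f]{2})"
    r"/((?:[0-9a-f]{2}:){4}[0-9a-f]{2})"
    r"/((?:[0-9a-f]{2}:){4}[0-9a-f]{2})"
    r"/([0-9a-f]{2})"
)


def decode_bits(value, table):
    """Return the names of all flags from `table` that are set in `value`."""
    return [name for bit, name in table.items() if value & bit]


def parse_taskfile(line):
    """Parse a libata 'cmd' or 'res' log line into its register fields."""
    m = TASKFILE_RE.search(line)
    if m is None:
        raise ValueError("no cmd/res taskfile found in line")
    kind, first, low, high, device = m.groups()
    feature, nsect, lbal, lbam, lbah = (int(b, 16) for b in low.split(":"))
    _hob_feature, hob_nsect, hob_lbal, hob_lbam, hob_lbah = (
        int(b, 16) for b in high.split(":")
    )
    # 48-bit LBA: high-order bytes first, then the low-order bytes.
    lba = (hob_lbah << 40 | hob_lbam << 32 | hob_lbal << 24
           | lbah << 16 | lbam << 8 | lbal)
    return {
        "kind": kind,
        # For "cmd" this is the command opcode; for "res" it is the status.
        "command_or_status": int(first, 16),
        # For "cmd" this is the feature register; for "res" it is the error.
        "feature_or_error": feature,
        "sectors": hob_nsect << 8 | nsect,
        "lba": lba,
        "device": int(device, 16),
    }


cmd = parse_taskfile("ata4.01: cmd 35/00:00:00:94:b2/00:04:13:00:00/f0 tag 0 dma 524288 out")
res = parse_taskfile("res 51/84:01:ff:95:b2/84:02:13:00:00/f0 Emask 0x30 (host bus error)")

print(hex(cmd["lba"]), cmd["sectors"], cmd["sectors"] * 512)  # 512-byte sectors assumed
print(decode_bits(res["command_or_status"], STATUS_BITS))
print(decode_bits(res["feature_or_error"], ERROR_BITS))
```

Under these assumptions the command covers 1024 sectors, which at 512 bytes each equals the 524288 bytes shown after "dma" in the log, and the decoded result flags agree with the "status: { DRDY ERR }" and "error: { ICRC ABRT }" lines the kernel printed.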
I've also tried a mainline kernel (2.6.35-020635rc1) and still get the
same errors, except that there's an additional stack trace:
-----------------------------------------------
Jun 14 18:55:40 cellar kernel: [ 152.874172] irq 19: nobody cared (try booting with the "irqpoll" option)
Jun 14 18:55:40 cellar kernel: [ 152.874182] Pid: 0, comm: swapper Tainted: P 2.6.35-020635rc1-generic #020635rc1
Jun 14 18:55:40 cellar kernel: [ 152.874185] Call Trace:
Jun 14 18:55:40 cellar kernel: [ 152.874198] [<c01a58cc>] __report_bad_irq+0x2c/0x90
Jun 14 18:55:40 cellar kernel: [ 152.874204] [<c016fee3>] ? sched_clock_tick+0x73/0xa0
Jun 14 18:55:40 cellar kernel: [ 152.874209] [<c01a5a44>] note_interrupt+0xe4/0x120
Jun 14 18:55:40 cellar kernel: [ 152.874214] [<c0179da0>] ? tick_nohz_update_jiffies+0x60/0x70
Jun 14 18:55:40 cellar kernel: [ 152.874219] [<c01a6364>] handle_fasteoi_irq+0x84/0xe0
Jun 14 18:55:40 cellar kernel: [ 152.874224] [<c0104abf>] handle_irq+0x1f/0x30
Jun 14 18:55:40 cellar kernel: [ 152.874230] [<c05afefb>] do_IRQ+0x4b/0xc0
Jun 14 18:55:40 cellar kernel: [ 152.874234] [<c01032f0>] common_interrupt+0x30/0x40
Jun 14 18:55:40 cellar kernel: [ 152.874239] [<c010a3a7>] ? mwait_idle+0x57/0xa0
Jun 14 18:55:40 cellar kernel: [ 152.874243] [<c010189c>] cpu_idle+0x8c/0xc0
Jun 14 18:55:40 cellar kernel: [ 152.874249] [<c05a4337>] start_secondary+0xf7/0x130
Jun 14 18:55:40 cellar kernel: [ 152.874252] handlers:
Jun 14 18:55:40 cellar kernel: [ 152.874254] [<c0431060>] (ata_bmdma_interrupt+0x0/0x190)
Jun 14 18:55:40 cellar kernel: [ 152.874261] [<c044fb10>] (usb_hcd_irq+0x0/0x90)
Jun 14 18:55:40 cellar kernel: [ 152.874268] Disabling IRQ #19
Jun 14 18:56:09 cellar kernel: [ 181.856015] ata4: lost interrupt (Status 0x51)
Jun 14 18:56:09 cellar kernel: [ 181.856034] ata4.01: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Jun 14 18:56:09 cellar kernel: [ 181.856039] ata4.01: BMDMA stat 0x46, BMDMA stat 0x0, BMDMA stat 0x0, BMDMA stat 0x0, BMDMA stat 0x0
Jun 14 18:56:09 cellar kernel: [ 181.856045] ata4.01: failed command: WRITE DMA EXT
Jun 14 18:56:09 cellar kernel: [ 181.856053] ata4.01: cmd 35/00:00:00:84:08/00:04:3b:00:00/f0 tag 0 dma 524288 out
Jun 14 18:56:09 cellar kernel: [ 181.856054] res 40/00:00:00:4f:c2/00:00:00:00:00/50 Emask 0x24 (host bus error)
Jun 14 18:56:09 cellar kernel: [ 181.856058] ata4.01: status: { DRDY }
Jun 14 18:56:09 cellar kernel: [ 181.856072] ata4: soft resetting link
Jun 14 18:56:09 cellar kernel: [ 182.160065] ata4.01: configured for UDMA/133
Jun 14 18:56:09 cellar kernel: [ 182.160072] ata4.01: device reported invalid CHS sector 0
Jun 14 18:56:09 cellar kernel: [ 182.160080] ata4: EH complete
--------------------------------------------------------------------
I've tried booting with "libata.force=noncq" on both kernels (lucid
stable and 2.6.35 mainline), but it makes no difference.
I didn't see these errors in Jaunty. I think they started sometime in
Karmic. I upgraded to Lucid in the hope that the newer release had
fixed it, but it made no difference.
I think I've ruled out HDD failure. I get these errors on two old (3+
years) Seagate 7200.10 disks as well as on a brand new Seagate 7200.12
disk.
There are similar bug reports in Launchpad, but one difference I
noticed is that I consistently see the message "failed command: WRITE
DMA EXT", while the other reports fail during a read or some other
command.
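One way to check which commands fail, and on which device, is to tally the "failed command:" lines in a kernel log. The sketch below is illustrative only: tally_failed_commands is a hypothetical helper, and the regular expression assumes the line format shown in the traces above. The two sample lines are copied from this report.

```python
import re
from collections import Counter

# Matches e.g. "ata4.01: failed command: WRITE DMA EXT".
FAILED_RE = re.compile(r"(ata\d+(?:\.\d+)?): failed command: (.+?)\s*$")


def tally_failed_commands(lines):
    """Count (device, command) pairs across 'failed command:' log lines."""
    counts = Counter()
    for line in lines:
        m = FAILED_RE.search(line)
        if m:
            counts[(m.group(1), m.group(2))] += 1
    return counts


sample = [
    "Jun 13 13:33:03 cellar kernel: [66188.434879] ata4.01: failed command: WRITE DMA EXT",
    "Jun 13 13:33:03 cellar kernel: [66188.434907] ata4: soft resetting link",
    "Jun 14 18:56:09 cellar kernel: [ 181.856045] ata4.01: failed command: WRITE DMA EXT",
]
counts = tally_failed_commands(sample)
for (device, command), n in counts.items():
    print(device, command, n)
```

The same function accepts an open file object, so passing it a handle on /var/log/kern.log would tally the whole log, assuming the file is readable by the current user.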
I can very reliably reproduce the errors by running an rdiff-backup
'restore' operation from an external USB HDD.
== Steps to reproduce ==
1. Boot into GNOME and log in
2. Run 'tail -f /var/log/kern.log' in one terminal window
3. Run 'rdiff-backup --force -r now /media/freeagent/share /share/' in another terminal
Within a few seconds, I can see the errors show up in the kernel logs.
Running a fast torrent download will do the trick too.
Since I can reproduce the problem so easily, I'll be very willing to
try any special kernel builds to help solve this one.
ProblemType: Bug
DistroRelease: Ubuntu 10.04
Package: linux-image-2.6.32-22-generic 2.6.32-22.36
Regression: Yes
Reproducible: Yes
ProcVersionSignature: Ubuntu 2.6.32-22.36-generic 2.6.32.11+drm33.2
Uname: Linux 2.6.32-22-generic i686
NonfreeKernelModules: nvidia
AlsaVersion: Advanced Linux Sound Architecture Driver Version 1.0.21.
Architecture: i386
AudioDevicesInUse:
USER PID ACCESS COMMAND
/dev/snd/controlC0: antrix 1387 F.... pulseaudio
CRDA: Error: [Errno 2] No such file or directory
Card0.Amixer.info:
Card hw:0 'Intel'/'HDA Intel at 0xf9ffc000 irq 16'
Mixer name : 'Realtek ALC662 rev1'
Components : 'HDA:10ec0662,15650000,00100101'
Controls : 36
Simple ctrls : 19
Date: Mon Jun 14 19:23:00 2010
HibernationDevice: RESUME=UUID=c6dab799-13a8-443e-b2a3-4b93f3bbb42e
IwConfig:
lo no wireless extensions.
eth0 no wireless extensions.
MachineType: BIOSTAR Group G31-M7 TE
ProcCmdLine: BOOT_IMAGE=/vmlinuz-2.6.32-22-generic root=UUID=466535ad-0b59-4fd0-b18b-ba486150f91a ro quiet splash
ProcEnviron:
PATH=(custom, user)
LANG=en_SG.utf8
SHELL=/bin/bash
RelatedPackageVersions: linux-firmware 1.34
RfKill:
SourcePackage: linux
dmi.bios.date: 04/10/2009
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: 080014
dmi.board.asset.tag: To Be Filled By O.E.M.
dmi.board.name: G31-M7 TE
dmi.board.vendor: BIOSTAR Group
dmi.chassis.asset.tag: None
dmi.chassis.type: 3
dmi.chassis.vendor: BIOSTAR Group
dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvr080014:bd04/10/2009:svnBIOSTARGroup:pnG31-M7TE:pvr:rvnBIOSTARGroup:rnG31-M7TE:rvr:cvnBIOSTARGroup:ct3:cvr:
dmi.product.name: G31-M7 TE
dmi.sys.vendor: BIOSTAR Group
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/593635/+subscriptions