kernel-packages team mailing list archive
Message #18426
[Bug 593635] Re: HDD freezes caused by ata exception that results in soft resetting of link
Deepak Sarda, this bug was reported a while ago and there hasn't been
any activity in it recently. We were wondering if this is still an
issue? If so, could you please test for this with the latest development
release of Ubuntu? ISO images are available from
http://cdimage.ubuntu.com/daily-live/current/ .
If it remains an issue, could you please run the following command in
the development release from a Terminal
(Applications->Accessories->Terminal), as it will automatically gather
and attach updated debug information to this report:
apport-collect -p linux <replace-with-bug-number>
Also, could you please test the latest upstream kernel available, following
https://wiki.ubuntu.com/KernelMainlineBuilds ? This will allow additional
upstream developers to examine the issue. Please do not test a kernel from
the daily folder; use the one listed at the very bottom instead. Once you've
tested the upstream kernel, please comment on which kernel version
specifically you tested. If this bug is fixed in the mainline kernel, please
add the following tags:
kernel-fixed-upstream
kernel-fixed-upstream-VERSION-NUMBER
where VERSION-NUMBER is the version number of the kernel you tested. For example:
kernel-fixed-upstream-v3.12-rc2
This can be done by clicking on the yellow circle with a black pencil icon
next to the word Tags, located at the bottom of the bug description. Please
also remove the tag:
needs-upstream-testing
If the mainline kernel does not fix this bug, please add the following tags:
kernel-bug-exists-upstream
kernel-bug-exists-upstream-VERSION-NUMBER
Please also remove the tag:
needs-upstream-testing
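The tagging rules above amount to a small decision. A minimal Python sketch follows; the function name `upstream_tags` and its return shape are purely illustrative and not part of any Launchpad tooling:

```python
def upstream_tags(version, fixed):
    """Return the tags to add and remove after testing a mainline kernel.

    version: the tested kernel version string, e.g. "v3.12-rc2"
    fixed:   True if the bug no longer occurs on that kernel
    """
    base = "kernel-fixed-upstream" if fixed else "kernel-bug-exists-upstream"
    return {
        "add": [base, f"{base}-{version}"],
        # The testing-request tag is removed in both cases.
        "remove": ["needs-upstream-testing"],
    }
```

For example, `upstream_tags("v3.12-rc2", True)` lists `kernel-fixed-upstream` and `kernel-fixed-upstream-v3.12-rc2` as the tags to add.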
Once testing of the upstream kernel is complete, please mark this bug's
Status as Confirmed. Please let us know your results. Thank you for your
understanding.
** Changed in: linux (Ubuntu)
Status: Triaged => Incomplete
--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/593635
Title:
HDD freezes caused by ata exception that results in soft resetting of
link
Status in “linux” package in Ubuntu:
Incomplete
Bug description:
Under even moderately heavy disk writes, I am seeing exceptions like the ones below in my kern.log:
-----------------------------------------------
Jun 13 13:33:03 cellar kernel: [66188.434868] ata4.01: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6
Jun 13 13:33:03 cellar kernel: [66188.434874] ata4.01: BMDMA stat 0x46
Jun 13 13:33:03 cellar kernel: [66188.434879] ata4.01: failed command: WRITE DMA EXT
Jun 13 13:33:03 cellar kernel: [66188.434886] ata4.01: cmd 35/00:00:00:94:b2/00:04:13:00:00/f0 tag 0 dma 524288 out
Jun 13 13:33:03 cellar kernel: [66188.434888] res 51/84:01:ff:95:b2/84:02:13:00:00/f0 Emask 0x30 (host bus error)
Jun 13 13:33:03 cellar kernel: [66188.434892] ata4.01: status: { DRDY ERR }
Jun 13 13:33:03 cellar kernel: [66188.434895] ata4.01: error: { ICRC ABRT }
Jun 13 13:33:03 cellar kernel: [66188.434907] ata4: soft resetting link
Jun 13 13:33:03 cellar kernel: [66188.622000] ata4.01: configured for UDMA/100
Jun 13 13:33:03 cellar kernel: [66188.622013] ata4: EH complete
----------------------------------------------
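For anyone decoding the cmd/res lines above by hand, here is a rough Python sketch. It assumes the field order libata uses when printing a taskfile (command or status, feature or error, sector count, three low LBA bytes, then the high-order bytes and the device register), an LBA48 command, and 512-byte sectors. The function and table names are hypothetical.

```python
import re

# Only the bits the kernel names in its "status:" and "error:" lines are listed.
STATUS_BITS = [(0x80, "BSY"), (0x40, "DRDY"), (0x20, "DF"), (0x08, "DRQ"), (0x01, "ERR")]
ERROR_BITS = [(0x80, "ICRC"), (0x40, "UNC"), (0x10, "IDNF"), (0x04, "ABRT")]

# "cmd" or "res" followed by twelve hex bytes separated by "/" or ":".
TASKFILE_RE = re.compile(r"\b(cmd|res) ((?:[0-9a-f]{2}[/:]){11}[0-9a-f]{2})")


def decode_bits(value, table):
    """Return the names of all bits from `table` that are set in `value`."""
    return [name for mask, name in table if value & mask]


def parse_taskfile(line):
    """Parse one libata 'cmd ...' or 'res ...' log line into a dict, or None."""
    m = TASKFILE_RE.search(line)
    if m is None:
        return None
    regs = [int(b, 16) for b in re.split(r"[/:]", m.group(2))]
    (first, second, nsect, lbal, lbam, lbah,
     _hob_feature, hob_nsect, hob_lbal, hob_lbam, hob_lbah, device) = regs
    # LBA48: low three bytes first, then the three high-order (HOB) bytes.
    lba = int.from_bytes(
        bytes([lbal, lbam, lbah, hob_lbal, hob_lbam, hob_lbah]), "little")
    return {
        "kind": m.group(1),
        "command_or_status": first,   # command for cmd lines, status for res lines
        "feature_or_error": second,   # feature for cmd lines, error for res lines
        "lba": lba,
        "sectors": (hob_nsect << 8) | nsect,
        "device": device,
    }
```

Applied to the lines above, the cmd line decodes to 1024 sectors, which at 512 bytes per sector is 524288 bytes and matches the "dma 524288" figure. The res line decodes to status DRDY ERR and error ICRC ABRT, which is what the kernel itself prints.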
This is with the latest stable Lucid kernel (2.6.32-22-generic
#36-Ubuntu).
I've also tried a mainline kernel (2.6.35-020635rc1) and still get the
same errors, except that there's an additional stack trace:
-----------------------------------------------
Jun 14 18:55:40 cellar kernel: [ 152.874172] irq 19: nobody cared (try booting with the "irqpoll" option)
Jun 14 18:55:40 cellar kernel: [ 152.874182] Pid: 0, comm: swapper Tainted: P 2.6.35-020635rc1-generic #020635rc1
Jun 14 18:55:40 cellar kernel: [ 152.874185] Call Trace:
Jun 14 18:55:40 cellar kernel: [ 152.874198] [<c01a58cc>] __report_bad_irq+0x2c/0x90
Jun 14 18:55:40 cellar kernel: [ 152.874204] [<c016fee3>] ? sched_clock_tick+0x73/0xa0
Jun 14 18:55:40 cellar kernel: [ 152.874209] [<c01a5a44>] note_interrupt+0xe4/0x120
Jun 14 18:55:40 cellar kernel: [ 152.874214] [<c0179da0>] ? tick_nohz_update_jiffies+0x60/0x70
Jun 14 18:55:40 cellar kernel: [ 152.874219] [<c01a6364>] handle_fasteoi_irq+0x84/0xe0
Jun 14 18:55:40 cellar kernel: [ 152.874224] [<c0104abf>] handle_irq+0x1f/0x30
Jun 14 18:55:40 cellar kernel: [ 152.874230] [<c05afefb>] do_IRQ+0x4b/0xc0
Jun 14 18:55:40 cellar kernel: [ 152.874234] [<c01032f0>] common_interrupt+0x30/0x40
Jun 14 18:55:40 cellar kernel: [ 152.874239] [<c010a3a7>] ? mwait_idle+0x57/0xa0
Jun 14 18:55:40 cellar kernel: [ 152.874243] [<c010189c>] cpu_idle+0x8c/0xc0
Jun 14 18:55:40 cellar kernel: [ 152.874249] [<c05a4337>] start_secondary+0xf7/0x130
Jun 14 18:55:40 cellar kernel: [ 152.874252] handlers:
Jun 14 18:55:40 cellar kernel: [ 152.874254] [<c0431060>] (ata_bmdma_interrupt+0x0/0x190)
Jun 14 18:55:40 cellar kernel: [ 152.874261] [<c044fb10>] (usb_hcd_irq+0x0/0x90)
Jun 14 18:55:40 cellar kernel: [ 152.874268] Disabling IRQ #19
Jun 14 18:56:09 cellar kernel: [ 181.856015] ata4: lost interrupt (Status 0x51)
Jun 14 18:56:09 cellar kernel: [ 181.856034] ata4.01: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Jun 14 18:56:09 cellar kernel: [ 181.856039] ata4.01: BMDMA stat 0x46, BMDMA stat 0x0, BMDMA stat 0x0, BMDMA stat 0x0, BMDMA stat 0x0
Jun 14 18:56:09 cellar kernel: [ 181.856045] ata4.01: failed command: WRITE DMA EXT
Jun 14 18:56:09 cellar kernel: [ 181.856053] ata4.01: cmd 35/00:00:00:84:08/00:04:3b:00:00/f0 tag 0 dma 524288 out
Jun 14 18:56:09 cellar kernel: [ 181.856054] res 40/00:00:00:4f:c2/00:00:00:00:00/50 Emask 0x24 (host bus error)
Jun 14 18:56:09 cellar kernel: [ 181.856058] ata4.01: status: { DRDY }
Jun 14 18:56:09 cellar kernel: [ 181.856072] ata4: soft resetting link
Jun 14 18:56:09 cellar kernel: [ 182.160065] ata4.01: configured for UDMA/133
Jun 14 18:56:09 cellar kernel: [ 182.160072] ata4.01: device reported invalid CHS sector 0
Jun 14 18:56:09 cellar kernel: [ 182.160080] ata4: EH complete
--------------------------------------------------------------------
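The "nobody cared" report above lists which handlers are registered on the affected IRQ. A small Python sketch for pulling that list out of a kernel log is shown below. It assumes the log layout shown above (a "handlers:" line followed by one parenthesised symbol per line), and the function name is hypothetical.

```python
import re

IRQ_RE = re.compile(r"irq (\d+): nobody cared")
# Handler lines look like: [<c0431060>] (ata_bmdma_interrupt+0x0/0x190)
HANDLER_RE = re.compile(r"\[<[0-9a-f]+>\] \((\w+)\+0x[0-9a-f]+/0x[0-9a-f]+\)")


def shared_irq_handlers(lines):
    """Map each IRQ from a 'nobody cared' report to its listed handler names."""
    result = {}
    irq = None
    in_handlers = False
    for line in lines:
        m = IRQ_RE.search(line)
        if m:
            irq = int(m.group(1))
            result[irq] = []
            in_handlers = False
            continue
        if irq is None:
            continue
        if line.rstrip().endswith("handlers:"):
            in_handlers = True
            continue
        if in_handlers:
            h = HANDLER_RE.search(line)
            if h:
                result[irq].append(h.group(1))
            else:
                # First non-handler line ends the report for this IRQ.
                in_handlers = False
                irq = None
    return result
```

On the trace above this yields IRQ 19 with the handlers `ata_bmdma_interrupt` and `usb_hcd_irq`, meaning the log shows the ATA and USB host controller handlers registered on the same interrupt line.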
I've tried booting with "libata.force=noncq" on both kernels (Lucid
stable & 2.6.35 mainline), but it makes no difference.
I didn't see these errors in Jaunty. I think they started sometime in
Karmic. I upgraded to Lucid in the hope that the newer release had fixed
it, but it made no difference.
I think I've ruled out HDD failure. I get these errors on 2 old (3+
years) Seagate 7200.10 disks as well as a brand new Seagate 7200.12
disk.
There are similar bug reports in Launchpad, but one difference I
noticed is that I consistently see the message "failed command: WRITE
DMA EXT", while the other reports fail during a read or some other
command.
I can very reliably reproduce the errors by running an rdiff-backup
'restore' operation from an external USB HDD.
== Steps to reproduce ==
1. Boot into GNOME & log in
2. Run 'tail -f /var/log/kern.log' in one terminal window
3. Run 'rdiff-backup --force -r now /media/freeagent/share /share/' in another terminal
Within a few seconds, I can see the errors show up in the kernel logs.
Running a fast torrent download will do the trick too.
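To compare runs more easily than by watching 'tail -f', the failures can be tallied from the log afterwards. This is only a sketch: the function name is hypothetical, and it assumes the "failed command:" line format shown in the logs above.

```python
import re
from collections import Counter

# Matches e.g. "ata4.01: failed command: WRITE DMA EXT"
FAILED_RE = re.compile(r"(ata\d+(?:\.\d+)?): failed command: (.+)$")


def count_failed_commands(lines):
    """Count failed ATA commands per (device, command) pair."""
    counts = Counter()
    for line in lines:
        m = FAILED_RE.search(line)
        if m:
            counts[(m.group(1), m.group(2).strip())] += 1
    return counts

# Possible usage against the log file named in step 2:
#   with open("/var/log/kern.log") as f:
#       print(count_failed_commands(f))
```

A tally made up only of WRITE DMA EXT entries would fit the observation that these failures always occur on writes.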
Since I can reproduce the problem so easily, I'll be very willing to
try any special kernel builds to help solve this one.
ProblemType: Bug
DistroRelease: Ubuntu 10.04
Package: linux-image-2.6.32-22-generic 2.6.32-22.36
Regression: Yes
Reproducible: Yes
ProcVersionSignature: Ubuntu 2.6.32-22.36-generic 2.6.32.11+drm33.2
Uname: Linux 2.6.32-22-generic i686
NonfreeKernelModules: nvidia
AlsaVersion: Advanced Linux Sound Architecture Driver Version 1.0.21.
Architecture: i386
AudioDevicesInUse:
 USER        PID ACCESS COMMAND
 /dev/snd/controlC0:  antrix  1387 F.... pulseaudio
CRDA: Error: [Errno 2] No such file or directory
Card0.Amixer.info:
 Card hw:0 'Intel'/'HDA Intel at 0xf9ffc000 irq 16'
   Mixer name   : 'Realtek ALC662 rev1'
   Components   : 'HDA:10ec0662,15650000,00100101'
   Controls     : 36
   Simple ctrls : 19
Date: Mon Jun 14 19:23:00 2010
HibernationDevice: RESUME=UUID=c6dab799-13a8-443e-b2a3-4b93f3bbb42e
IwConfig:
 lo        no wireless extensions.
 eth0      no wireless extensions.
MachineType: BIOSTAR Group G31-M7 TE
ProcCmdLine: BOOT_IMAGE=/vmlinuz-2.6.32-22-generic root=UUID=466535ad-0b59-4fd0-b18b-ba486150f91a ro quiet splash
ProcEnviron:
 PATH=(custom, user)
 LANG=en_SG.utf8
 SHELL=/bin/bash
RelatedPackageVersions: linux-firmware 1.34
RfKill:
SourcePackage: linux
dmi.bios.date: 04/10/2009
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: 080014
dmi.board.asset.tag: To Be Filled By O.E.M.
dmi.board.name: G31-M7 TE
dmi.board.vendor: BIOSTAR Group
dmi.chassis.asset.tag: None
dmi.chassis.type: 3
dmi.chassis.vendor: BIOSTAR Group
dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvr080014:bd04/10/2009:svnBIOSTARGroup:pnG31-M7TE:pvr:rvnBIOSTARGroup:rnG31-M7TE:rvr:cvnBIOSTARGroup:ct3:cvr:
dmi.product.name: G31-M7 TE
dmi.sys.vendor: BIOSTAR Group
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/593635/+subscriptions