kernel-packages team mailing list archive

[Bug 1348832] Re: e1000e Detected Hardware Unit Hang

 

Hello all,
This just happened on my server running 3.13.0-39-generic. The NIC was under I/O load from concurrent backups.

In my syslog I found many entries like this:
Nov 10 23:21:17 server kernel: [759261.907344] e1000e 0000:00:19.0 eth1: Detected Hardware Unit Hang:
Nov 10 23:21:17 server kernel: [759261.907344]   TDH                  <de>
Nov 10 23:21:17 server kernel: [759261.907344]   TDT                  <ed>
Nov 10 23:21:17 server kernel: [759261.907344]   next_to_use          <ed>
Nov 10 23:21:17 server kernel: [759261.907344]   next_to_clean        <dc>
Nov 10 23:21:17 server kernel: [759261.907344] buffer_info[next_to_clean]:
Nov 10 23:21:17 server kernel: [759261.907344]   time_stamp           <10b4f2f76>
Nov 10 23:21:17 server kernel: [759261.907344]   next_to_watch        <de>
Nov 10 23:21:17 server kernel: [759261.907344]   jiffies              <10b4f30c5>
Nov 10 23:21:17 server kernel: [759261.907344]   next_to_watch.status <0>
Nov 10 23:21:17 server kernel: [759261.907344] MAC Status             <40080183>
Nov 10 23:21:17 server kernel: [759261.907344] PHY Status             <796d>
Nov 10 23:21:17 server kernel: [759261.907344] PHY 1000BASE-T Status  <3800>
Nov 10 23:21:17 server kernel: [759261.907344] PHY Extended Status    <3000>
Nov 10 23:21:17 server kernel: [759261.907344] PCI Status             <10>
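
For anyone comparing dumps like this one, the hex fields can be decoded with a few lines of Python. This is only a sketch under stated assumptions: the names `parse_hang_dump` and `summarize` are made up for illustration, the ring size of 256 descriptors is assumed (it is not confirmed for this machine), and reading `jiffies - time_stamp` as "how long the oldest buffer has been waiting" is an interpretation of the log, not something taken from the driver source.

```python
import re

# Matches dump lines such as "...]   TDH                  <de>".
FIELD_RE = re.compile(r"\]\s+([A-Za-z_][\w .\-]*?)\s+<([0-9a-fA-F]+)>\s*$")

# A few lines copied from the syslog excerpt above.
SAMPLE = """\
Nov 10 23:21:17 server kernel: [759261.907344]   TDH                  <de>
Nov 10 23:21:17 server kernel: [759261.907344]   TDT                  <ed>
Nov 10 23:21:17 server kernel: [759261.907344]   time_stamp           <10b4f2f76>
Nov 10 23:21:17 server kernel: [759261.907344]   jiffies              <10b4f30c5>
""".splitlines()


def parse_hang_dump(lines):
    """Return {field name: integer value} for the lines of one hang dump."""
    fields = {}
    for line in lines:
        match = FIELD_RE.search(line)
        if match:
            # Values are printed in hex without a 0x prefix.
            fields[match.group(1)] = int(match.group(2), 16)
    return fields


def summarize(fields, ring_size=256):
    """Derive two rough indicators from a parsed dump.

    ring_size is an assumption; pass the real TX ring size if it is known.
    """
    # Descriptors between head (TDH) and tail (TDT), modulo the ring size.
    pending = (fields["TDT"] - fields["TDH"]) % ring_size
    # Ticks elapsed since the buffer at next_to_clean was time-stamped.
    stalled = fields["jiffies"] - fields["time_stamp"]
    return {"pending_descriptors": pending, "stalled_jiffies": stalled}


if __name__ == "__main__":
    print(summarize(parse_hang_dump(SAMPLE)))
```

For the dump above this gives 15 pending descriptors and 335 jiffies between `time_stamp` and `jiffies`.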


# lspci
00:00.0 Host bridge: Intel Corporation Xeon E3-1200 v2/Ivy Bridge DRAM Controller (rev 09)
00:19.0 Ethernet controller: Intel Corporation 82579LM Gigabit Network Connection (rev 05)
00:1a.0 USB controller: Intel Corporation 6 Series/C200 Series Chipset Family USB Enhanced Host Controller #2 (rev 05)
00:1c.0 PCI bridge: Intel Corporation 6 Series/C200 Series Chipset Family PCI Express Root Port 1 (rev b5)
00:1c.4 PCI bridge: Intel Corporation 6 Series/C200 Series Chipset Family PCI Express Root Port 5 (rev b5)
00:1d.0 USB controller: Intel Corporation 6 Series/C200 Series Chipset Family USB Enhanced Host Controller #1 (rev 05)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev a5)
00:1f.0 ISA bridge: Intel Corporation C204 Chipset Family LPC Controller (rev 05)
00:1f.2 SATA controller: Intel Corporation 6 Series/C200 Series Chipset Family SATA AHCI Controller (rev 05)
00:1f.3 SMBus: Intel Corporation 6 Series/C200 Series Chipset Family SMBus Controller (rev 05)
01:00.0 RAID bus controller: Adaptec Series 6 - 6G SAS/PCIe 2 (rev 01)
02:00.0 Ethernet controller: Intel Corporation 82574L Gigabit Network Connection
03:03.0 VGA compatible controller: Matrox Electronics Systems Ltd. MGA G200eW WPCM450 (rev 0a)

Any help appreciated.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1348832

Title:
  e1000e Detected Hardware Unit Hang

Status in “linux” package in Ubuntu:
  Expired

Bug description:
  An e1000e device on a trusty router/server, up for a few weeks,
  stopped responding and went into a "Detected Hardware Unit Hang" loop.
  At the time, the interface was under moderate load, about 60 Mbps
  inbound.

  [1871400.121801] e1000e 0000:00:19.0 eth1: Detected Hardware Unit Hang:
  [1871400.121801]   TDH                  <92>
  [1871400.121801]   TDT                  <cf>
  [1871400.121801]   next_to_use          <cf>
  [1871400.121801]   next_to_clean        <90>
  [1871400.121801] buffer_info[next_to_clean]:
  [1871400.121801]   time_stamp           <11be7959f>
  [1871400.121801]   next_to_watch        <92>
  [1871400.121801]   jiffies              <11be79797>
  [1871400.121801]   next_to_watch.status <0>
  [1871400.121801] MAC Status             <80083>
  [1871400.121801] PHY Status             <796d>
  [1871400.121801] PHY 1000BASE-T Status  <7800>
  [1871400.121801] PHY Extended Status    <3000>
  [1871400.121801] PCI Status             <10>
  [1871402.120130] e1000e 0000:00:19.0 eth1: Detected Hardware Unit Hang:
  [1871402.120130]   TDH                  <92>
  [1871402.120130]   TDT                  <cf>
  [1871402.120130]   next_to_use          <cf>
  [1871402.120130]   next_to_clean        <90>
  [1871402.120130] buffer_info[next_to_clean]:
  [1871402.120130]   time_stamp           <11be7959f>
  [1871402.120130]   next_to_watch        <92>
  [1871402.120130]   jiffies              <11be7998b>
  [1871402.120130]   next_to_watch.status <0>
  [1871402.120130] MAC Status             <80083>
  [1871402.120130] PHY Status             <796d>
  [1871402.120130] PHY 1000BASE-T Status  <7800>
  [1871402.120130] PHY Extended Status    <3000>
  [1871402.120130] PCI Status             <10>
  [1871404.118513] e1000e 0000:00:19.0 eth1: Detected Hardware Unit Hang:
  [1871404.118513]   TDH                  <92>
  [1871404.118513]   TDT                  <cf>
  [1871404.118513]   next_to_use          <cf>
  [1871404.118513]   next_to_clean        <90>
  [1871404.118513] buffer_info[next_to_clean]:
  [1871404.118513]   time_stamp           <11be7959f>
  [1871404.118513]   next_to_watch        <92>
  [1871404.118513]   jiffies              <11be79b7f>
  [1871404.118513]   next_to_watch.status <0>
  [1871404.118513] MAC Status             <80083>
  [1871404.118513] PHY Status             <796d>
  [1871404.118513] PHY 1000BASE-T Status  <7800>
  [1871404.118513] PHY Extended Status    <3000>
  [1871404.118513] PCI Status             <10>
  [1871405.129366] ------------[ cut here ]------------
  [1871405.129374] WARNING: CPU: 0 PID: 0 at /build/buildd/linux-3.13.0/net/sched/sch_generic.c:264 dev_watchdog+0x276/0x280()
  [1871405.129375] NETDEV WATCHDOG: eth1 (e1000e): transmit queue 0 timed out
  [1871405.129376] Modules linked in: ksplice_vijajns4_vmlinux_new(OF) ksplice_vijajns4(OF) ksplice_ax32i68r_vmlinux_new(OF) ksplice_ax32i68r(OF) ksplice_n4y5yx0j_vmlinux_new(OF) ksplice_n4y5yx0j(OF) nct6775(F) hwmon_vid(F) xt_nat vhost_net vhost macvtap macvlan ip6t_REJECT ipt_REJECT xt_LOG ipt_MASQUERADE xt_TCPMSS iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack xt_tcpudp ip6table_filter ip6_tables iptable_filter ip_tables x_tables sit tunnel4 ip_tunnel bridge stp llc nfsd auth_rpcgss nfs_acl nfs lockd sunrpc fscache jfs dm_multipath scsi_dh x86_pkg_temp_thermal intel_powerclamp kvm_intel kvm ftdi_sio joydev usbserial parport_pc mei_me mei serio_raw ppdev acpi_pad mac_hid coretemp lp parport dm_crypt raid456 async_memcpy async_raid6_recov async_pq async_xor async_tx xor raid6_pq raid1 raid0 multipath linear hid_generic raid10 usbhid hid crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 i915 lrw e1000e gf128mul glue_helper ablk_helper i2c_algo_bit cryptd psmouse ahci drm_kms_helper ptp libahci pps_core video drm [last unloaded: ksplice_vijajns4_vmlinux_old]
  [1871405.129434] CPU: 0 PID: 0 Comm: swapper/0 Tainted: GF          O 3.13.0-30-generic #54-Ubuntu
  [1871405.129435] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./Z97 Pro3, BIOS P1.40 05/23/2014
  [1871405.129436]  0000000000000009 ffff88081fa03d98 ffffffff8171a324 ffff88081fa03de0
  [1871405.129440]  ffff88081fa03dd0 ffffffff810676bd 0000000000000000 ffff8807eeb94000
  [1871405.129441]  ffff8807ee867c80 0000000000000001 0000000000000000 ffff88081fa03e30
  [1871405.129444] Call Trace:
  [1871405.129445]  <IRQ>  [<ffffffff8171a324>] dump_stack+0x45/0x56
  [1871405.129453]  [<ffffffff810676bd>] warn_slowpath_common+0x7d/0xa0
  [1871405.129455]  [<ffffffff8106772c>] warn_slowpath_fmt+0x4c/0x50
  [1871405.129459]  [<ffffffff8163f096>] dev_watchdog+0x276/0x280
  [1871405.129461]  [<ffffffff8163ee20>] ? dev_graft_qdisc+0x80/0x80
  [1871405.129464]  [<ffffffff81074226>] call_timer_fn+0x36/0x100
  [1871405.129467]  [<ffffffff8163ee20>] ? dev_graft_qdisc+0x80/0x80
  [1871405.129468]  [<ffffffff810751bf>] run_timer_softirq+0x1ef/0x2f0
  [1871405.129471]  [<ffffffff8106caec>] __do_softirq+0xec/0x2c0
  [1871405.129475]  [<ffffffff8106d035>] irq_exit+0x105/0x110
  [1871405.129478]  [<ffffffff8172d0c5>] smp_apic_timer_interrupt+0x45/0x60
  [1871405.129482]  [<ffffffff8172ba5d>] apic_timer_interrupt+0x6d/0x80
  [1871405.129483]  <EOI>  [<ffffffff815cd232>] ? cpuidle_enter_state+0x52/0xc0
  [1871405.129488]  [<ffffffff815cd359>] cpuidle_idle_call+0xb9/0x1f0
  [1871405.129491]  [<ffffffff8101ce9e>] arch_cpu_idle+0xe/0x30
  [1871405.129493]  [<ffffffff810beb95>] cpu_startup_entry+0xc5/0x290
  [1871405.129495]  [<ffffffff817087f7>] rest_init+0x77/0x80
  [1871405.129499]  [<ffffffff81d35f70>] start_kernel+0x438/0x443
  [1871405.129502]  [<ffffffff81d35941>] ? repair_env_string+0x5c/0x5c
  [1871405.129504]  [<ffffffff81d35120>] ? early_idt_handlers+0x120/0x120
  [1871405.129507]  [<ffffffff81d355ee>] x86_64_start_reservations+0x2a/0x2c
  [1871405.129509]  [<ffffffff81d35733>] x86_64_start_kernel+0x143/0x152
  [1871405.129511] ---[ end trace 51d2ab0f0f49c9f7 ]---
  [1871405.129517] e1000e 0000:00:19.0 eth1: Reset adapter unexpectedly
  [1871408.464217] e1000e: eth1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
  [1872210.451151] e1000e 0000:00:19.0 eth1: Detected Hardware Unit Hang:
  [1872210.451151]   TDH                  <d0>
  [1872210.451151]   TDT                  <fc>
  [1872210.451151]   next_to_use          <fc>
  [1872210.451151]   next_to_clean        <ce>
  [1872210.451151] buffer_info[next_to_clean]:
  [1872210.451151]   time_stamp           <11beaad13>
  [1872210.451151]   next_to_watch        <d0>
  [1872210.451151]   jiffies              <11beaaf95>
  [1872210.451151]   next_to_watch.status <0>
  [1872210.451151] MAC Status             <80083>
  [1872210.451151] PHY Status             <796d>
  [1872210.451151] PHY 1000BASE-T Status  <3800>
  [1872210.451151] PHY Extended Status    <3000>
  [1872210.451151] PCI Status             <10>
  [1872212.449706] e1000e 0000:00:19.0 eth1: Detected Hardware Unit Hang:
  [1872212.449706]   TDH                  <d0>
  [1872212.449706]   TDT                  <fc>
  [1872212.449706]   next_to_use          <fc>
  [1872212.449706]   next_to_clean        <ce>
  [1872212.449706] buffer_info[next_to_clean]:
  [1872212.449706]   time_stamp           <11beaad13>
  [1872212.449706]   next_to_watch        <d0>
  [1872212.449706]   jiffies              <11beab189>
  [1872212.449706]   next_to_watch.status <0>
  [1872212.449706] MAC Status             <80083>
  [1872212.449706] PHY Status             <796d>
  [1872212.449706] PHY 1000BASE-T Status  <3800>
  [1872212.449706] PHY Extended Status    <3000>
  [1872212.449706] PCI Status             <10>
  [1872213.452369] e1000e 0000:00:19.0 eth1: Reset adapter unexpectedly
  [1872217.718430] e1000e: eth1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
  [1945526.043907] e1000e: eth1 NIC Link is Down
  [1945543.935226] e1000e: eth1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
  [1945545.975407] e1000e: eth1 NIC Link is Down
  [1945548.894443] e1000e: eth1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
  [1958330.855525] e1000e: eth1 NIC Link is Down
  [1958349.624164] e1000e: eth1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
  [1958351.234642] e1000e: eth1 NIC Link is Down
  [1958354.397490] e1000e: eth1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
  [1965758.988824] e1000e 0000:00:19.0 eth1: Detected Hardware Unit Hang:
  [1965758.988824]   TDH                  <37>
  [1965758.988824]   TDT                  <8d>
  [1965758.988824]   next_to_use          <8d>
  [1965758.988824]   next_to_clean        <37>
  [1965758.988824] buffer_info[next_to_clean]:
  [1965758.988824]   time_stamp           <11d4fd4d9>
  [1965758.988824]   next_to_watch        <37>
  [1965758.988824]   jiffies              <11d4fd739>
  [1965758.988824]   next_to_watch.status <0>
  [1965758.988824] MAC Status             <80083>
  [1965758.988824] PHY Status             <796d>
  [1965758.988824] PHY 1000BASE-T Status  <3800>
  [1965758.988824] PHY Extended Status    <3000>
  [1965758.988824] PCI Status             <10>
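
As a rough cross-check of the log above, the bracketed timestamps and the `jiffies` values from two dumps can be combined to estimate the kernel tick rate, and from that how long the transmit ring had been stuck. This is a minimal sketch: the function names are hypothetical, and it assumes the bracketed timestamps are seconds and that `time_stamp` is in the same units as `jiffies`.

```python
def estimate_tick_rate(t0, jiffies0, t1, jiffies1):
    """Estimate ticks per second from two (log timestamp, jiffies) samples."""
    return (jiffies1 - jiffies0) / (t1 - t0)


def stalled_seconds(time_stamp, jiffies, ticks_per_second):
    """Convert the gap between time_stamp and jiffies into seconds."""
    return (jiffies - time_stamp) / ticks_per_second


if __name__ == "__main__":
    # Values copied from the first and third dumps in the log above.
    rate = estimate_tick_rate(1871400.121801, 0x11be79797,
                              1871404.118513, 0x11be79b7f)
    hz = round(rate)
    print("estimated ticks per second:", hz)
    print("stall at third dump (s):",
          stalled_seconds(0x11be7959f, 0x11be79b7f, hz))
```

With the first and third dumps this yields roughly 250 ticks per second and a stall of about 6 seconds, during which TDH and time_stamp do not change.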

  ProblemType: Bug
  DistroRelease: Ubuntu 14.04
  Package: linux-image-3.13.0-30-generic 3.13.0-30.55
  ProcVersionSignature: Ubuntu 3.13.0-30.55-generic 3.13.11.2
  Uname: Linux 3.13.0-30-generic x86_64
  AlsaDevices:
   total 0
   crw-rw---- 1 root audio 116,  1 Jul 25 14:36 seq
   crw-rw---- 1 root audio 116, 33 Jul 25 14:36 timer
  AplayDevices: Error: [Errno 2] No such file or directory: 'aplay'
  ApportVersion: 2.14.1-0ubuntu3.2
  Architecture: amd64
  ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord'
  AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
  CRDA: Error: [Errno 2] No such file or directory: 'iw'
  Date: Fri Jul 25 15:27:00 2014
  HibernationDevice: RESUME=/dev/mapper/nibbler-swap_1
  InstallationDate: Installed on 2012-03-03 (874 days ago)
  InstallationMedia: Ubuntu-Server 12.04 LTS "Precise Pangolin" - Beta amd64 (20120229)
  MachineType: To Be Filled By O.E.M. To Be Filled By O.E.M.
  PciMultimedia:
   
  ProcEnviron:
   TERM=screen
   PATH=(custom, no user)
   LANG=en_US.UTF-8
   SHELL=/bin/bash
  ProcFB: 0 inteldrmfb
  ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-3.13.0-30-generic root=/dev/mapper/hostname-root ro nomdmonddf nomdmonisw nomdmonddf nomdmonisw
  RelatedPackageVersions:
   linux-restricted-modules-3.13.0-30-generic N/A
   linux-backports-modules-3.13.0-30-generic  N/A
   linux-firmware                             1.127.4
  RfKill: Error: [Errno 2] No such file or directory: 'rfkill'
  SourcePackage: linux
  UpgradeStatus: Upgraded to trusty on 2014-03-11 (136 days ago)
  dmi.bios.date: 05/23/2014
  dmi.bios.vendor: American Megatrends Inc.
  dmi.bios.version: P1.40
  dmi.board.name: Z97 Pro3
  dmi.board.vendor: ASRock
  dmi.chassis.asset.tag: To Be Filled By O.E.M.
  dmi.chassis.type: 3
  dmi.chassis.vendor: To Be Filled By O.E.M.
  dmi.chassis.version: To Be Filled By O.E.M.
  dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvrP1.40:bd05/23/2014:svnToBeFilledByO.E.M.:pnToBeFilledByO.E.M.:pvrToBeFilledByO.E.M.:rvnASRock:rnZ97Pro3:rvr:cvnToBeFilledByO.E.M.:ct3:cvrToBeFilledByO.E.M.:
  dmi.product.name: To Be Filled By O.E.M.
  dmi.product.version: To Be Filled By O.E.M.
  dmi.sys.vendor: To Be Filled By O.E.M.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1348832/+subscriptions

