kernel-packages team mailing list archive

[Bug 1395269] Re: [e1000e] ethtool -t eth0 offline loses routing table

 

So the idea is for drivers not to tell the kernel that the interface
went down while they're running self-tests?  According to the Red Hat
bug, igb had this problem fixed, but apparently e1000e did not.

 Yes, I'm pretty sure the interface goes down during the offline portion
of the full set of self-tests, for my e1000e.  Connected to my switch,
it takes longer than usual to autonegotiate a link.  I should have
posted this in the initial report, but here's the actual output:

sudo ethtool -t eth0 offline
The test result is FAIL
The test extra info:
Register test  (offline)         0
Eeprom test    (offline)         0
Interrupt test (offline)         0
Loopback test  (offline)         0
Link test   (on/offline)         1
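
In the output above, only the link test has a nonzero value in the last column.  As a minimal sketch, assuming a nonzero value means that test failed (which appears to be the case for e1000e, but is driver-specific) and that the extra-info lines always look like the ones above, a small filter can list the failing tests.  The function name failed_tests is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper, not part of ethtool.
# failed_tests: reads `ethtool -t` output on stdin and prints the name of
# every test whose result column is nonzero.
failed_tests() {
    awk '/\(.*offline\)[ \t]+[0-9]+[ \t]*$/ && $NF != 0 {
        sub(/[ \t]*\(.*$/, "")   # drop "(on/offline)  N", keep the test name
        print
    }'
}
```

Usage would be something like: sudo ethtool -t eth0 offline | failed_tests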

If this doesn't usually happen with e1000e, the long autonegotiation is
probably the corner case that's causing it: it takes so long that the
link test fails.  (Also, would it make sense to run the link test first,
before the offline tests that trigger autonegotiation?  Or do we WANT to
flag sketchy setups that need a SmartSpeed fallback to 10baseT to get a
working link?)


The other solution would be to have ethtool save and restore the routing
table entries for that interface.  That could go wrong in corner cases,
though, such as complex routing tables or changes made during the
self-test while the interface was still online, so it would be a lot of
work to implement safely.  On second thought, never mind: there's more
than just IPv4 to save and restore routing tables for, and any custom
protocol that ethtool doesn't know about would not get its routing table
saved and restored.
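
For anyone who wants a userspace workaround in the meantime, the save/restore idea can be sketched as a wrapper script.  This is only a sketch under assumptions: it relies on iproute2's "ip route show" and "ip route replace", it only covers the main IPv4 and IPv6 tables (exactly the limitation described above), and it does nothing about routes changed while the test runs.  All function names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers, not part of ethtool.

# routes_for_iface IFACE
# Reads `ip route show` output on stdin and prints only the routes whose
# output device is exactly IFACE.
routes_for_iface() {
    awk -v dev="$1" '{
        for (i = 1; i < NF; i++)
            if ($i == "dev" && $(i + 1) == dev) { print; next }
    }'
}

# restore_routes FAMILY
# Reads saved routes on stdin and re-adds each one.  FAMILY is -4 or -6.
# "replace" is used so routes the kernel already re-created are not errors.
restore_routes() {
    while IFS= read -r route; do
        # Word splitting of $route is intentional here.
        ip "$1" route replace $route ||
            echo "could not restore: $route" >&2
    done
}

# offline_test_keep_routes IFACE
# Saves the routes for IFACE, runs the offline self-test, restores them,
# and returns ethtool's exit status.
offline_test_keep_routes() {
    iface=$1
    v4=$(ip -4 route show | routes_for_iface "$iface")
    v6=$(ip -6 route show | routes_for_iface "$iface")
    ethtool -t "$iface" offline
    status=$?
    [ -n "$v4" ] && printf '%s\n' "$v4" | restore_routes -4
    [ -n "$v6" ] && printf '%s\n' "$v6" | restore_routes -6
    return "$status"
}
```

Run as root, this would be called as: offline_test_keep_routes eth0.  Any saved line that "ip route replace" does not accept as input is reported on stderr rather than silently dropped.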

 Anyway, thanks for having a look into this.  It's not a problem for me
now that I know about it, just wanted to get it reported so at least the
docs could include a warning.  That's all that I think really needs
doing, since checking every driver would be a lot of work.


How about:
 ethtool(8):
...
       offline
              Perform all tests, including ones that interrupt normal
              operation.  Some drivers may bring the interface down/up
              during this process, flushing routing table entries.
              They shouldn't, but be prepared just in case.  Report
              problems with specific drivers against the Linux kernel
              (not ethtool).


 The "report a bug" sentence is probably too much, and could go.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1395269

Title:
  [e1000e] ethtool -t eth0 offline loses routing table

Status in “linux” package in Ubuntu:
  Confirmed

Bug description:
  ethtool -t eth0 offline runs the tests, but leaves the routing table
  with only the entry for the local network.  In my case I had to run
  sudo route add default gw 10.0.0.1 to get the default route back.
  The online test didn't do this.

  Ubuntu 14.04, ethtool 1:3.13-1

  Linux tesla 3.13.0-39-generic #66-Ubuntu SMP Tue Oct 28 13:30:27 UTC
  2014 x86_64 x86_64 x86_64 GNU/Linux

  ethtool -i eth0: 
  driver: e1000e
  version: 2.3.2-k
  firmware-version: 1.1-0
  bus-info: 0000:00:19.0

  relevant kernel log:
  [637008.472410] e1000e 0000:00:19.0 eth0: offline testing starting
  [637009.077985] e1000e 0000:00:19.0 eth0: testing unshared interrupt
  [637022.468941] e1000e 0000:00:19.0: irq 45 for MSI/MSI-X
  [637022.572094] e1000e 0000:00:19.0: irq 45 for MSI/MSI-X
  [637022.572257] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
  [637037.432893] e1000e: eth0 NIC Link is Up 10 Mbps Full Duplex, Flow Control: Rx/Tx
  [637037.433003] e1000e 0000:00:19.0 eth0: Link Speed was downgraded by SmartSpeed
  [637037.433005] e1000e 0000:00:19.0 eth0: 10/100 speed: disabling TSO
  [637037.433035] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
  [637037.982611] net_ratelimit: 3 callbacks suppressed
  [637037.982623] IPv4: martian source 10.0.0.17 from 80.73.161.44, on dev eth0
  [637037.982628] ll header: 00000000: 00 19 d1 11 b4 9b 00 03 6d 11 34 1b 08 00        ........m.4...

  (The martian packets are from TCP connections that my router is still
  NATing to this machine; without its routing table, the machine isn't
  happy to see them.)
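
To put a number on the slow autonegotiation, the bracketed timestamps in the log above can be subtracted.  A minimal sketch, assuming the log keeps the "link is not ready" / "link becomes ready" wording shown here; the function name link_down_seconds is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper for reading the kernel log excerpt above.
# link_down_seconds: reads kernel log lines on stdin and prints the number
# of seconds between "link is not ready" and the next "link becomes ready".
link_down_seconds() {
    # Split fields on "[" and "]" so that $2 is the timestamp.
    awk -F'[][]' '
        /link is not ready/          { down = $2 + 0; seen = 1 }
        /link becomes ready/ && seen { printf "%.1f\n", $2 - down; seen = 0 }
    '
}
```

Fed the log lines from this report (for example via dmesg | link_down_seconds), it prints 14.9, i.e. the link took roughly 15 seconds to come back.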

  And yes, my e1000e is autonegotiating to 10baseT/Full on the same
  cables and switch that still work at 1000baseT with another machine,
  which is why I was running self-tests.  I thought this machine used to
  run at 1000baseT; it would be weird if I went 5 years without noticing
  my desktop being slow.  That's not what this bug report is about,
  though.

  The e1000e hardware is on an Intel DG965WH motherboard (ICH8 / G965 graphics, first-gen Core 2):
  00:19.0 Ethernet controller: Intel Corporation 82566DC Gigabit Network Connection (rev 02)
          Subsystem: Intel Corporation Device 0001
          Flags: bus master, fast devsel, latency 0, IRQ 45
          Memory at e0300000 (32-bit, non-prefetchable) [size=128K]
          Memory at e0324000 (32-bit, non-prefetchable) [size=4K]
          I/O ports at 20e0 [size=32]
          Capabilities: [c8] Power Management version 2
          Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
          Kernel driver in use: e1000e

  
  $ ethtool eth0
  Settings for eth0:
          Supported ports: [ TP ]
          Supported link modes:   10baseT/Half 10baseT/Full 
                                  100baseT/Half 100baseT/Full 
                                  1000baseT/Full 
          Supported pause frame use: No
          Supports auto-negotiation: Yes
          Advertised link modes:  10baseT/Full 
                                  100baseT/Full 
                                  1000baseT/Full 
          Advertised pause frame use: No
          Advertised auto-negotiation: Yes
          Speed: 10Mb/s
          Duplex: Full
          Port: Twisted Pair
          PHYAD: 1
          Transceiver: internal
          Auto-negotiation: on
          MDI-X: on (auto)
  Cannot get wake-on-lan settings: Operation not permitted
          Current message level: 0x00000007 (7)
                                 drv probe link
          Link detected: yes

  
  This appears to be the same problem that someone reported to Red Hat a
  while ago, which got marked as fixed for the igb driver:
  https://bugzilla.redhat.com/show_bug.cgi?format=multiple&id=661976
  There's not much useful info in their bug tracker, because the bug
  that it's a duplicate of is now flagged private, so nobody can look at
  it.

   Possibly this is a per-driver thing, unless the right fix is to have
  ethtool save/restore the routing table entries for that iface.

  ProblemType: Bug
  DistroRelease: Ubuntu 14.04
  Package: ethtool 1:3.13-1
  ProcVersionSignature: Ubuntu 3.13.0-39.66-generic 3.13.11.8
  Uname: Linux 3.13.0-39-generic x86_64
  ApportVersion: 2.14.1-0ubuntu3.5
  Architecture: amd64
  Date: Sat Nov 22 03:00:35 2014
  Dependencies:
   gcc-4.9-base 4.9.1-0ubuntu1
   libc6 2.19-0ubuntu6.3
   libgcc1 1:4.9.1-0ubuntu1
   multiarch-support 2.19-0ubuntu6.3
  SourcePackage: ethtool
  UpgradeStatus: Upgraded to trusty on 2014-07-14 (130 days ago)
  --- 
  ApportVersion: 2.14.1-0ubuntu3.5
  Architecture: amd64
  AudioDevicesInUse:
   USER        PID ACCESS COMMAND
   /dev/snd/controlC0:  peter      2715 F.... pulseaudio
   /dev/snd/pcmC0D0p:   peter      2715 F...m pulseaudio
  CRDA: Error: [Errno 2] No such file or directory
  DistroRelease: Ubuntu 14.04
  HibernationDevice: RESUME=UUID=0da07ae0-ff5a-43c6-9702-519aff370fd5
  IwConfig: Error: [Errno 2] No such file or directory
  Package: linux (not installed)
  ProcFB: 0 inteldrmfb
  ProcKernelCmdLine: root=LABEL=t-root2 ro
  ProcVersionSignature: Ubuntu 3.13.0-39.66-generic 3.13.11.8
  RelatedPackageVersions:
   linux-restricted-modules-3.13.0-39-generic N/A
   linux-backports-modules-3.13.0-39-generic  N/A
   linux-firmware                             1.127.4
  RfKill: Error: [Errno 2] No such file or directory
  Tags:  trusty
  Uname: Linux 3.13.0-39-generic x86_64
  UpgradeStatus: Upgraded to trusty on 2014-07-14 (131 days ago)
  UserGroups: adm admin audio cdrom dialout dip floppy fuse lpadmin plugdev sambashare scanner src staff users vboxusers video
  _MarkForUpload: True
  dmi.bios.date: 11/17/2008
  dmi.bios.vendor: Intel Corp.
  dmi.bios.version: MQ96510J.86A.1754.2008.1117.0002
  dmi.board.asset.tag: Base Board Asset Tag
  dmi.board.name: DG965WH
  dmi.board.vendor: Intel Corporation
  dmi.board.version: AAD41692-304
  dmi.chassis.type: 3
  dmi.modalias: dmi:bvnIntelCorp.:bvrMQ96510J.86A.1754.2008.1117.0002:bd11/17/2008:svn:pn:pvr:rvnIntelCorporation:rnDG965WH:rvrAAD41692-304:cvn:ct3:cvr:
