kernel-packages team mailing list archive

[Bug 1508706] Re: Networking hangs on azure using hv_netvsc; bisected

 

I have tested the patch referenced in comment #5 and it appears to
resolve the network hang.

I first built and tested the Ubuntu LTS 3.19.0-31.36~14.04.1 kernel and
reproduced the issue using the methodology described in the original bug
description.  This is commit

commit 15e42c329445b4e0f0aecefc39e205c44755c2ba
Author: Luis Henriques <luis.henriques@xxxxxxxxxxxxx>
Date:   Thu Oct 8 10:26:57 2015 +0100

    UBUNTU: Ubuntu-lts-3.19.0-31.36~14.04.1

in the lts-backport-vivid branch of
git://kernel.ubuntu.com/ubuntu/ubuntu-trusty.git

I then applied the referenced patch, retested, and was unable to
reproduce the issue in roughly an hour of testing.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1508706

Title:
  Networking hangs on azure using hv_netvsc; bisected

Status in linux package in Ubuntu:
  Triaged

Bug description:
  
  I am running Ubuntu instances on Azure and testing basic networking
  between two instances.  This involves configuring VXLAN between the
  two instances and running iperf and an rsync of the kernel tree
  between them, e.g.,

  ip link add vxlan0 type vxlan id 999 local 10.88.0.12 remote 10.88.0.11 dev eth0
  ip l set vxlan0 up
  ip addr add 242.0.0.12/8 dev vxlan0
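
  The commands above configure only one endpoint.  As a rough sketch of
  how both ends could be set up symmetrically, the helper below prints
  the three commands for a given endpoint rather than running them.  The
  function name vxlan_cmds is hypothetical, and the peer's overlay
  address 242.0.0.11/8 is an assumption; the report only gives the
  10.88.0.12 side.

```shell
#!/bin/sh
# Hypothetical helper: print (do not execute) the VXLAN setup commands
# for one endpoint.  Arguments: local underlay IP, remote underlay IP,
# overlay address in CIDR form.
vxlan_cmds() {
    local_ip=$1
    remote_ip=$2
    overlay_addr=$3
    echo "ip link add vxlan0 type vxlan id 999 local $local_ip remote $remote_ip dev eth0"
    echo "ip link set vxlan0 up"
    echo "ip addr add $overlay_addr dev vxlan0"
}

# Endpoint shown in the report.
vxlan_cmds 10.88.0.12 10.88.0.11 242.0.0.12/8

# Assumed mirror-image configuration for the peer (overlay address is a guess).
vxlan_cmds 10.88.0.11 10.88.0.12 242.0.0.11/8
```

  The printed commands need root to take effect, so the output would be
  reviewed and then run by hand (or piped to a root shell) on each
  instance.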

  After some time (sometimes instantly, sometimes up to 30 minutes of
  activity), the networking will hang.  This hang takes two forms:  a
  complete loss of connectivity (all network, even the ssh session used
  to log in), or just a loss of connectivity between instances (the ssh
  session remains active).  In the latter case, the ssh session
  sometimes hangs later as well.

  This first appeared when testing with the Ubuntu 3.19 kernel, and I
  subsequently bisected this to:

  commit effa2012d207f78cbc5a8360e62d420a8860b7e9
  Author: KY Srinivasan <kys@xxxxxxxxxxxxx>
  Date:   Mon May 11 15:39:46 2015 -0700

      hv_netvsc: Use the xmit_more skb flag to optimize signaling the host

      BugLink: http://bugs.launchpad.net/bugs/1454892

      Based on the information given to this driver (via the xmit_more skb flag),
      we can defer signaling the host if more packets are on the way. This will help
      make the host more efficient since it can potentially process a larger batch of
      packets. Implement this optimization.

      Signed-off-by: K. Y. Srinivasan <kys@xxxxxxxxxxxxx>
      Signed-off-by: David S. Miller <davem@xxxxxxxxxxxxx>
      Acked-by: Tim Gardner <tim.gardner@xxxxxxxxxxxxx>
      Acked-by: Brad Figg <brad.figg@xxxxxxxxxxxxx>
      Signed-off-by: Brad Figg <brad.figg@xxxxxxxxxxxxx>

  I also tested the mainline kernel (net-next); it fails with the
  equivalent commit:

  commit 82fa3c776e5abba7ed6e4b4f4983d14731c37d6a
  Author: KY Srinivasan <kys@xxxxxxxxxxxxx>
  Date:   Mon May 11 15:39:46 2015 -0700

      hv_netvsc: Use the xmit_more skb flag to optimize signaling the host

  For both kernel trees, I also tested the prior commit and it did not
  exhibit the failure after many hours.  For Ubuntu, this was

  commit a4aeb290bd75af5e16a6144a418291476ac6140c
  Author: K. Y. Srinivasan <kys@xxxxxxxxxxxxx>
  Date:   Wed Mar 18 12:29:29 2015 -0700

      Drivers: hv: vmbus: Export the vmbus_sendpacket_pagebuffer_ctl()

  and for mainline it was

  commit 9eea92226407e7a117ef1ceef45380ebd000a0e2
  Author: Alexei Starovoitov <ast@xxxxxxxxxxxx>
  Date:   Mon May 11 15:19:48 2015 -0700

      pktgen: fix packet generation

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1508706/+subscriptions

