
kernel-packages team mailing list archive

[Bug 1570195] Re: Net tools cause kernel soft lockup after DPDK touched VirtIO-pci devices

 

Xenial has been released, so we are back in SRU mode.
I am therefore adding the matching SRU template for the upload of 2.2.0ubuntu8, which is in the unapproved queue at the moment.

[Impact]

 * Using a device from DPDK and the kernel at the same time drives the system into hangs
 * The fix stops DPDK from using devices that are still in use by the kernel
 * The fix is a backport of a patch accepted upstream
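
To illustrate what "still in use by the kernel" can mean in practice, here is a minimal sketch that prints the kernel driver a PCI device is currently bound to by reading its sysfs "driver" symlink. The function name pci_driver and its optional sysfs-root argument are hypothetical helpers for illustration only; they are not part of DPDK or of the backported fix, and the fix itself may decide ownership differently.

```shell
#!/bin/sh
# Print the kernel driver bound to a PCI device, or "none" if it is unbound.
# $1: full PCI address, for example 0000:00:05.0
# $2: sysfs root (assumed default /sys; overridable so the function can be tested)
pci_driver() {
    link="${2:-/sys}/bus/pci/devices/$1/driver"
    if [ -L "$link" ]; then
        # The symlink target ends in the driver name.
        basename "$(readlink "$link")"
    else
        echo none
    fi
}
```

On a guest like the one in the bug description, calling pci_driver 0000:00:05.0 would be expected to name a kernel driver for as long as the kernel still owns the device.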

[Test Case]

 * Run DPDK in a guest on virtio-pci devices
 * Afterwards, do anything that touches the queues of the device, such as ethtool -L

[Regression Potential]

 * Some existing setups might no longer work if they set up DPDK on kernel-owned devices. That is intentional, as such setups are only one step away from breaking their systems
 * The documentation in the server guide has been adapted to reflect the new requirements (the merge proposal is waiting for an ack)
 * The comments and examples in the config files have also been adapted to reflect the new style
 * Passed ADT tests on i386/amd64/amd64-lowmem and our full CI (https://code.launchpad.net/~ubuntu-server/ubuntu/+source/dpdk-testing/+git/dpdk-testing)

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1570195

Title:
  Net tools cause kernel soft lockup after DPDK touched VirtIO-pci
  devices

Status in dpdk package in Ubuntu:
  Confirmed
Status in linux package in Ubuntu:
  Confirmed

Bug description:
  Guys,

   I'm facing an issue with both "ethtool" and "ip" while trying to
  manage PCI VirtIO devices that are black-listed by DPDK.

   You'll need an Ubuntu Xenial KVM guest with 4 VirtIO vNICs to run
  these tests.

   PCI device example from inside a Xenial guest:

  ---
  # lspci | grep Ethernet
  00:03.0 Ethernet controller: Red Hat, Inc Virtio network device
  00:04.0 Ethernet controller: Red Hat, Inc Virtio network device
  00:05.0 Ethernet controller: Red Hat, Inc Virtio network device
  00:06.0 Ethernet controller: Red Hat, Inc Virtio network device
  ---

  Here "ens3" is the first / default interface, attached to Libvirt's
  "default" network. "ens4" is reserved for the "ethtool / ip" tests
  (attached to another Libvirt network without IPs or DHCP), "ens5"
  will be "dpdk0" and "ens6" will be "dpdk1".

  ---
   *** How does it work?

   1- For example, to enable multi-queue on the devices intended for
  DPDK, boot your Xenial guest and run:

   ethtool -L ens5 combined 4
   ethtool -L ens6 combined 4

   2- Install openvswitch-switch-dpdk, configure DPDK and OVS, and fire
  them up.

   https://help.ubuntu.com/16.04/serverguide/DPDK.html

   service openvswitch-switch stop
   service dpdk stop

   OVS DPDK Options (/etc/default/openvswitch-switch):

  --
  DPDK_OPTS='--dpdk -c 0x1 -n 4 --socket-mem 1024 --pci-blacklist 0000:00:03.0,0000:00:04.0'
  --
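
   The blacklist value above uses full PCI addresses, while lspci prints the short form. As a small sketch of that mapping, the hypothetical helper pci_blacklist below (not part of DPDK or OVS) prefixes each short address with PCI domain 0000, which is an assumption that matches this guest, and joins the results with commas:

```shell
#!/bin/sh
# Build a comma-separated list of full PCI addresses from short lspci-style ones.
# Assumes every device sits in PCI domain 0000, as in the lspci output above.
pci_blacklist() {
    out=""
    for short in "$@"; do
        # Add a separating comma only when the list is already non-empty.
        out="${out:+$out,}0000:$short"
    done
    printf '%s\n' "$out"
}

# The two devices kept away from DPDK in the DPDK_OPTS line above.
pci_blacklist 00:03.0 00:04.0
```

   The printed value is what would go after --pci-blacklist in DPDK_OPTS.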

   service dpdk start
   service openvswitch-switch start

   - Enable multi-queue on OVS+DPDK inside of the VM:

   ovs-vsctl set Open_vSwitch . other_config:n-dpdk-rxqs=4
   ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0xff00

   * Multi-queue apparently works! ovs-vswitchd consumes more than 100%
  of CPU, meaning that multi-queue is active...

   *** Where does it fail?

   1- Reboot the VM and try to run ethtool again (or go straight to 2
  below):

   ethtool -L ens5 combined 4

   2- Try to fire up ens4:

   ip link set dev ens4 up

  
   # FAIL! Both commands hang, consuming 100% of the guest's CPU...

   So, it looks like a Linux fault, because it is "allowing" the DPDK
  VirtIO app (a userland app) to interfere with kernel devices in a
  strange way...

  Best,
  Thiago

  ProblemType: Bug
  DistroRelease: Ubuntu 16.04
  Package: linux-image-4.4.0-18-generic 4.4.0-18.34
  ProcVersionSignature: Ubuntu 4.4.0-18.34-generic 4.4.6
  Uname: Linux 4.4.0-18-generic x86_64
  AlsaDevices:
   total 0
   crw-rw---- 1 root audio 116,  1 Apr 14 00:35 seq
   crw-rw---- 1 root audio 116, 33 Apr 14 00:35 timer
  AplayDevices: Error: [Errno 2] No such file or directory: 'aplay'
  ApportVersion: 2.20.1-0ubuntu1
  Architecture: amd64
  ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord'
  AudioDevicesInUse: Error: [Errno 2] No such file or directory: 'fuser'
  CRDA: N/A
  Date: Thu Apr 14 01:27:27 2016
  HibernationDevice: RESUME=UUID=833e999c-e066-433c-b8a2-4324bb8d56de
  InstallationDate: Installed on 2016-04-07 (7 days ago)
  InstallationMedia: Ubuntu-Server 16.04 LTS "Xenial Xerus" - Beta amd64 (20160406)
  IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig'
  Lsusb:
   Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
   Bus 004 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
   Bus 003 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
   Bus 002 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
  MachineType: QEMU Standard PC (i440FX + PIIX, 1996)
  PciMultimedia:
   
  ProcFB: 0 VESA VGA
  ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.4.0-18-generic root=UUID=9911604e-353b-491f-a0a9-804724350592 ro
  RelatedPackageVersions:
   linux-restricted-modules-4.4.0-18-generic N/A
   linux-backports-modules-4.4.0-18-generic  N/A
   linux-firmware                            N/A
  RfKill: Error: [Errno 2] No such file or directory: 'rfkill'
  SourcePackage: linux
  UpgradeStatus: No upgrade log present (probably fresh install)
  dmi.bios.date: 04/01/2014
  dmi.bios.vendor: SeaBIOS
  dmi.bios.version: Ubuntu-1.8.2-1ubuntu1
  dmi.chassis.type: 1
  dmi.chassis.vendor: QEMU
  dmi.chassis.version: pc-i440fx-wily
  dmi.modalias: dmi:bvnSeaBIOS:bvrUbuntu-1.8.2-1ubuntu1:bd04/01/2014:svnQEMU:pnStandardPC(i440FX+PIIX,1996):pvrpc-i440fx-wily:cvnQEMU:ct1:cvrpc-i440fx-wily:
  dmi.product.name: Standard PC (i440FX + PIIX, 1996)
  dmi.product.version: pc-i440fx-wily
  dmi.sys.vendor: QEMU

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/dpdk/+bug/1570195/+subscriptions

