kernel-packages team mailing list archive
Message #172027
[Bug 1570195] Re: Net tools cause kernel soft lockup after DPDK touched VirtIO-pci devices
Since ftrace failed me I switched to gdb via the qemu -s parameter.
Install the debuginfo and source of the guest kernel on the host:
sudo apt-get install linux-tools-4.4.0-18-dbgsym
sudo pull-lp-source linux 4.4.0-18.34
sudo mkdir -p /build/linux-XwpX40; sudo ln -s /home/ubuntu/linux-4.4.0 /build/linux-XwpX40/linux-4.4.0
Add the following to the guest's libvirt domain XML and restart the guest:
<domain type='kvm' xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'>
<qemu:commandline>
<qemu:arg value='-s'/>
</qemu:commandline>
Then start gdb on the host against the debug vmlinux and set breakpoints:
gdb /usr/lib/debug/boot/vmlinux-4.4.0-18-generic
b dev_ethtool
b ethtool_set_channels
b virtnet_set_channels
b virtnet_set_queues
Then on the guest run
sudo /usr/bin/testpmd --pci-blacklist 0000:00:03.0 --socket-mem 2048 -- --interactive --total-num-mbufs=2048
Attach gdb with
target remote :1234
Then on the guest trigger the bug
sudo ethtool -L eth1 combined 3
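The host-side gdb steps above can be collected in a command file so that the session is repeatable. This is only a sketch: the helper name write_gdb_cmds and the file name virtnet.gdb are made up for illustration, and the trailing "continue" is an addition that lets the guest run on after gdb attaches.

```shell
# Write the breakpoints and the remote attach from this message into a
# gdb command file. write_gdb_cmds and the file name are hypothetical.
write_gdb_cmds() {
    cat > "$1" <<'EOF'
b dev_ethtool
b ethtool_set_channels
b virtnet_set_channels
b virtnet_set_queues
target remote :1234
continue
EOF
}

write_gdb_cmds ./virtnet.gdb
# Then, on the host:
# gdb -x ./virtnet.gdb /usr/lib/debug/boot/vmlinux-4.4.0-18-generic
```

With the file in place, gdb stops at the first breakpoint hit once ethtool is run on the guest.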
It really is "hanging" in virtnet_send_command, which is called from virtnet_set_queues.
As expected, the loop never breaks.
1010 /* Spin for a response, the kick causes an ioport write, trapping
1011 * into the hypervisor, so the request should be handled immediately.
1012 */
1013 while (!virtqueue_get_buf(vi->cvq, &tmp) &&
1014 !virtqueue_is_broken(vi->cvq))
1015 cpu_relax();
1016
1017 return vi->ctrl_status == VIRTIO_NET_OK;
(gdb) n
1014 !virtqueue_is_broken(vi->cvq))
(gdb)
1013 while (!virtqueue_get_buf(vi->cvq, &tmp) &&
(gdb)
1015 cpu_relax();
[...]
Infinite loop.
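To make the shape of that loop explicit, here is a small user-space model of it. It is an illustration only, not kernel code and not a proposed fix: get_buf and is_broken are hypothetical stand-ins for virtqueue_get_buf() and virtqueue_is_broken(), and the max_spins bound is an addition that the kernel loop shown above does not have.

```shell
# Model of the wait loop: poll until a response arrives or the queue is
# marked broken. Both stand-ins always fail, which mirrors the state seen
# in gdb (no buffer ever comes back, the queue is never flagged broken).
get_buf()   { return 1; }
is_broken() { return 1; }

wait_for_response() {
    max_spins=$1    # added bound; without it this loop would never end
    spins=0
    while ! get_buf && ! is_broken; do
        spins=$(( spins + 1 ))
        if [ "$spins" -ge "$max_spins" ]; then
            echo "timeout after $spins spins"
            return 1
        fi
    done
    echo "loop exited after $spins spins"
}

wait_for_response 1000 || true
```

Since neither exit condition ever becomes true, only the added bound ends the loop. The kernel loop has no such bound, so it keeps spinning.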
--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1570195
Title:
Net tools cause kernel soft lockup after DPDK touched VirtIO-pci
devices
Status in dpdk package in Ubuntu:
Confirmed
Status in linux package in Ubuntu:
Confirmed
Bug description:
Guys,
I'm facing an issue with both "ethtool" and "ip" while trying
to manage VirtIO PCI devices that are black-listed by DPDK.
You'll need an Ubuntu Xenial KVM guest with 4 VirtIO vNICs to
run these tests.
PCI device example from inside a Xenial guest:
---
# lspci | grep Ethernet
00:03.0 Ethernet controller: Red Hat, Inc Virtio network device
00:04.0 Ethernet controller: Red Hat, Inc Virtio network device
00:05.0 Ethernet controller: Red Hat, Inc Virtio network device
00:06.0 Ethernet controller: Red Hat, Inc Virtio network device
---
Here "ens3" is the first / default interface, attached to Libvirt's
"default" network. "ens4" is reserved for the "ethtool / ip" tests
(attached to another Libvirt network without IPs or DHCP), "ens5"
will be "dpdk0" and "ens6" will be "dpdk1".
---
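The --pci-blacklist option used in this report takes full PCI addresses such as 0000:00:03.0, while the lspci listing above prints the short form. A rough sketch of the conversion follows. The helper name virtio_net_addrs is made up, and the fixed 0000: prefix assumes a guest with a single PCI domain.

```shell
# Print the full PCI address of every Virtio network device found in
# lspci output read from stdin. virtio_net_addrs is a hypothetical helper;
# the "0000:" domain prefix is an assumption (single PCI domain).
virtio_net_addrs() {
    awk '/Virtio network device/ { print "0000:" $1 }'
}

# Keep the first two NICs (ens3, ens4) for the kernel by black-listing
# them in DPDK; the input is the lspci listing from this report.
virtio_net_addrs <<'EOF' | head -n 2 | paste -sd, -
00:03.0 Ethernet controller: Red Hat, Inc Virtio network device
00:04.0 Ethernet controller: Red Hat, Inc Virtio network device
00:05.0 Ethernet controller: Red Hat, Inc Virtio network device
00:06.0 Ethernet controller: Red Hat, Inc Virtio network device
EOF
```

On a live guest the here-document would be replaced by piping lspci into the helper.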
*** How it works?
1- For example, try to enable multi-queue on DPDK's devices, boot
your Xenial guest, and run:
ethtool -L ens5 combined 4
ethtool -L ens6 combined 4
2- Install openvswitch-switch-dpdk, configure DPDK and OVS, and fire it
up.
https://help.ubuntu.com/16.04/serverguide/DPDK.html
service openvswitch-switch stop
service dpdk stop
OVS DPDK Options (/etc/default/openvswitch-switch):
--
DPDK_OPTS='--dpdk -c 0x1 -n 4 --socket-mem 1024 --pci-blacklist 0000:00:03.0,0000:00:04.0'
--
service dpdk start
service openvswitch-switch start
- Enable multi-queue on OVS+DPDK inside of the VM:
ovs-vsctl set Open_vSwitch . other_config:n-dpdk-rxqs=4
ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0xff00
* Multi-queue apparently works! ovs-vswitchd consumes more than 100%
of CPU, meaning that multi-queue is there...
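For reference, pmd-cpu-mask is a hexadecimal bitmask of CPU cores, like the -c value in DPDK_OPTS. A small sketch for decoding such a mask, assuming the usual convention that bit N selects core N (mask_to_cores is a made-up helper name, not an OVS or DPDK tool):

```shell
# Decode a hexadecimal CPU mask into the list of core ids it selects.
# mask_to_cores is a hypothetical helper, not part of OVS or DPDK.
mask_to_cores() {
    mask=$(( $1 ))      # shell arithmetic accepts the 0x prefix
    core=0
    out=""
    while [ "$mask" -gt 0 ]; do
        if [ $(( mask & 1 )) -eq 1 ]; then
            out="${out:+$out }$core"
        fi
        mask=$(( mask >> 1 ))
        core=$(( core + 1 ))
    done
    printf '%s\n' "$out"
}

mask_to_cores 0xff00   # pmd-cpu-mask from this report
mask_to_cores 0x1      # the -c value in DPDK_OPTS
```

Under that convention 0xff00 selects cores 8 to 15 and 0x1 selects core 0 only.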
*** Where it fails?
1- Reboot the VM and try to run ethtool again (or go straight to 2
below):
ethtool -L ens5 combined 4
2- Try to fire up ens4:
ip link set dev ens4 up
# FAIL! Both commands hang, consuming 100% of the guest's CPU...
So, it looks like a Linux fault, because it is "allowing" the DPDK
VirtIO App (a userland app) to interfere with kernel devices in a
strange way...
Best,
Thiago
ProblemType: Bug
DistroRelease: Ubuntu 16.04
Package: linux-image-4.4.0-18-generic 4.4.0-18.34
ProcVersionSignature: Ubuntu 4.4.0-18.34-generic 4.4.6
Uname: Linux 4.4.0-18-generic x86_64
AlsaDevices:
total 0
crw-rw---- 1 root audio 116, 1 Apr 14 00:35 seq
crw-rw---- 1 root audio 116, 33 Apr 14 00:35 timer
AplayDevices: Error: [Errno 2] No such file or directory: 'aplay'
ApportVersion: 2.20.1-0ubuntu1
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord'
AudioDevicesInUse: Error: [Errno 2] No such file or directory: 'fuser'
CRDA: N/A
Date: Thu Apr 14 01:27:27 2016
HibernationDevice: RESUME=UUID=833e999c-e066-433c-b8a2-4324bb8d56de
InstallationDate: Installed on 2016-04-07 (7 days ago)
InstallationMedia: Ubuntu-Server 16.04 LTS "Xenial Xerus" - Beta amd64 (20160406)
IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig'
Lsusb:
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 004 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Bus 003 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Bus 002 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
MachineType: QEMU Standard PC (i440FX + PIIX, 1996)
PciMultimedia:
ProcFB: 0 VESA VGA
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.4.0-18-generic root=UUID=9911604e-353b-491f-a0a9-804724350592 ro
RelatedPackageVersions:
linux-restricted-modules-4.4.0-18-generic N/A
linux-backports-modules-4.4.0-18-generic N/A
linux-firmware N/A
RfKill: Error: [Errno 2] No such file or directory: 'rfkill'
SourcePackage: linux
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 04/01/2014
dmi.bios.vendor: SeaBIOS
dmi.bios.version: Ubuntu-1.8.2-1ubuntu1
dmi.chassis.type: 1
dmi.chassis.vendor: QEMU
dmi.chassis.version: pc-i440fx-wily
dmi.modalias: dmi:bvnSeaBIOS:bvrUbuntu-1.8.2-1ubuntu1:bd04/01/2014:svnQEMU:pnStandardPC(i440FX+PIIX,1996):pvrpc-i440fx-wily:cvnQEMU:ct1:cvrpc-i440fx-wily:
dmi.product.name: Standard PC (i440FX + PIIX, 1996)
dmi.product.version: pc-i440fx-wily
dmi.sys.vendor: QEMU
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/dpdk/+bug/1570195/+subscriptions