
kernel-packages team mailing list archive

[Bug 1534049] [NEW] Ubuntu 12.04 + QEmu 2.0 + KSM = 1 + OVS, makes Windows 2008 R2 guests to crash

 

Public bug reported:

Hi,

I have been troubleshooting a problem on our platform for a long time.
Has anyone else encountered it?

The environment is as follows:

OpenStack environment built with Fuel.
Controller nodes: 3
Compute nodes: 30
Ceph nodes: 9
Windows virtio driver version: 61.71.104.10000
Ubuntu 12.04.4 LTS
QEMU emulator version 2.0.0 (Debian 2.0.0+dfsg-2ubuntu1.9), Copyright (c) 2003-2008 Fabrice Bellard

root@node-96:~# ovs-vsctl --version
ovs-vsctl (Open vSwitch) 2.0.2
Compiled Nov 28 2014 21:37:07


Symptoms:
Windows guest VMs on one host occasionally crash and restart automatically. After the restart, the guest's network adapter is disabled and cannot obtain an IP address via DHCP. A soft reboot does not help; only a hard reboot brings the adapter back.

Notes:
1. The crashing Windows guests are concentrated on a single physical node (HW RH2285). Other nodes use the same machine type but show no similar problem.
It may be an OVS bug: the Windows VMs could be receiving irregular packets that crash the Windows NIC and, later, the Windows system.
2. When a Windows VM crashes, several Windows VMs crash at the same time (about 3 or 4, not all of them).

At first I thought the Windows virtio drivers were the problem, but
upgrading them did not help. It feels like a QEMU driver problem, but
I am not sure about that.


Also, I am not sure whether this bug is related to the two bugs below. Following those reports, I have turned off KSM on the host and am currently testing. If anyone has run into the same problem, please reply. Thanks.

https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1346917
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1338277
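
For anyone repeating the KSM test, it can help to confirm that KSM is really off on the host before waiting for the next crash. The following is a minimal sketch, assuming the standard sysfs directory /sys/kernel/mm/ksm and its "run", "pages_shared" and "pages_sharing" files; the function name ksm_status is only illustrative and not part of any tool mentioned in this report.

```python
from pathlib import Path

# Standard sysfs location for KSM on Linux (assumed present on the host).
KSM_DIR = Path("/sys/kernel/mm/ksm")

# Meaning of the value in the "run" file, per the kernel's KSM documentation.
RUN_STATES = {
    0: "stopped",
    1: "running",
    2: "stopped, merged pages unmerged",
}


def ksm_status(ksm_dir=KSM_DIR):
    """Return KSM's run state and page counters read from sysfs."""
    ksm_dir = Path(ksm_dir)

    def read_int(name):
        return int((ksm_dir / name).read_text().strip())

    run = read_int("run")
    return {
        "run": run,
        "state": RUN_STATES.get(run, "unknown"),
        "pages_shared": read_int("pages_shared"),
        "pages_sharing": read_int("pages_sharing"),
    }


if __name__ == "__main__":
    if KSM_DIR.is_dir():
        print(ksm_status())
    else:
        print("KSM sysfs directory not found on this machine")
```

A "run" value of 0 only stops the merging daemon and leaves already merged pages shared; writing 2 to the same file also unmerges them, which may matter when testing whether shared pages are involved.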


dmesg log:
[13766077.712750] init: libvirt-bin main process (35678) killed by KILL signal
[13766077.712822] init: libvirt-bin main process ended, respawning
[13766081.675377] ip_set: protocol 6
[13770171.259174] qbre991247b-d0: port 2(tape991247b-d0) entered disabled state
[13770171.266161] device tape991247b-d0 left promiscuous mode
[13770171.266200] qbre991247b-d0: port 2(tape991247b-d0) entered disabled state
[13770203.296136] device tape991247b-d0 entered promiscuous mode
[13770203.329022] qbre991247b-d0: port 2(tape991247b-d0) entered forwarding state
[13770203.329040] qbre991247b-d0: port 2(tape991247b-d0) entered forwarding state
[13770204.527595] kvm: zapping shadow pages for mmio generation wraparound
[13771734.263654] qbre991247b-d0: port 2(tape991247b-d0) entered disabled state
[13771734.263704] qbre991247b-d0: port 1(qvbe991247b-d0) entered disabled state
[13771847.638690] qbre991247b-d0: port 2(tape991247b-d0) entered forwarding state
[13771847.638742] qbre991247b-d0: port 2(tape991247b-d0) entered forwarding state
[13771847.638758] qbre991247b-d0: port 1(qvbe991247b-d0) entered forwarding state
[13771847.638770] qbre991247b-d0: port 1(qvbe991247b-d0) entered forwarding state
[13784647.176340] qbr03992610-e3: port 1(qvb03992610-e3) entered disabled state
[13784668.538526] qbrc9002954-09: port 1(qvbc9002954-09) entered disabled state
[13792069.237135] qbre991247b-d0: port 2(tape991247b-d0) entered disabled state
[13792069.246187] device tape991247b-d0 left promiscuous mode
[13792069.246215] qbre991247b-d0: port 2(tape991247b-d0) entered disabled state
[13792070.174570] device tape991247b-d0 entered promiscuous mode
[13792070.207159] qbre991247b-d0: port 2(tape991247b-d0) entered forwarding state
[13792070.207181] qbre991247b-d0: port 2(tape991247b-d0) entered forwarding state
[13792071.041157] kvm: zapping shadow pages for mmio generation wraparound
[13794383.653582] qbre991247b-d0: port 2(tape991247b-d0) entered disabled state
[13794383.666387] device tape991247b-d0 left promiscuous mode
[13794383.666413] qbre991247b-d0: port 2(tape991247b-d0) entered disabled state
[13794384.468924] device tape991247b-d0 entered promiscuous mode
[13794384.501689] qbre991247b-d0: port 2(tape991247b-d0) entered forwarding state
[13794384.501710] qbre991247b-d0: port 2(tape991247b-d0) entered forwarding state
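
To see whether several guests lose their tap ports at the same moment, the bridge messages above can be grouped per port. This is a rough sketch, assuming dmesg lines in exactly the format shown in this report; the names port_events and downtime_windows are made up for illustration.

```python
import re
from collections import defaultdict

# Matches bridge port state changes such as:
# [13770171.259174] qbre991247b-d0: port 2(tape991247b-d0) entered disabled state
PORT_EVENT = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+(?P<bridge>\S+): port \d+\((?P<port>[^)]+)\) "
    r"entered (?P<state>\w+) state"
)


def port_events(lines):
    """Group (timestamp, state) pairs by port name, in log order."""
    events = defaultdict(list)
    for line in lines:
        match = PORT_EVENT.search(line)
        if match:
            events[match.group("port")].append(
                (float(match.group("ts")), match.group("state"))
            )
    return dict(events)


def downtime_windows(events):
    """Pair the first 'disabled' event with the next 'forwarding' event."""
    windows = []
    down_since = None
    for ts, state in events:
        if state == "disabled" and down_since is None:
            down_since = ts
        elif state == "forwarding" and down_since is not None:
            windows.append((down_since, ts))
            down_since = None
    return windows


if __name__ == "__main__":
    sample = [
        "[13770171.259174] qbre991247b-d0: port 2(tape991247b-d0) entered disabled state",
        "[13770171.266161] device tape991247b-d0 left promiscuous mode",
        "[13770171.266200] qbre991247b-d0: port 2(tape991247b-d0) entered disabled state",
        "[13770203.329022] qbre991247b-d0: port 2(tape991247b-d0) entered forwarding state",
    ]
    for port, evts in port_events(sample).items():
        print(port, downtime_windows(evts))
```

Running this over the full dmesg output from the affected node and comparing the windows across tap devices would show whether the 3 or 4 guests that crash together also drop off their bridges together.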


/var/log/libvirt/qemu/instance-0000304b.log
qemu: terminating on signal 15 from pid 138887
2016-01-11 05:16:04.937+0000: shutting down
2016-01-11 05:16:05.709+0000: starting up
LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/bin:/usr/sbin:/sbin:/bin QEMU_AUDIO_DRV=none /usr/bin/kvm-spice -name instance-0000304b -S -machine pc-i440fx-trusty,accel=kvm,usb=off -cpu Westmere -m 8192 -realtime mlock=off -smp 4,sockets=4,cores=1,threads=1 -uuid 4a6391d2-a26f-448b-a693-ba4b10d6ee6d -smbios type=1,manufacturer=OpenStack Foundation,product=OpenStack Nova,version=2014.2,serial=fba59e94-bf97-4eda-9e43-d866b9eb1598,uuid=4a6391d2-a26f-448b-a693-ba4b10d6ee6d -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/instance-0000304b.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=discard -no-hpet -no-shutdown -boot strict=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=rbd:volumes/volume-3c6afa29-7cef-4881-823c-fe523e069bea:id=compute:key=AQA1s/RUaAoTLxAAQuBPsckd8J/j6RZ2AciIJA==:auth_supported=cephx\;none:mon_host=10.14.52.4\:6789\;10.14.52.5\:6789\;10.14.52.6\:6789,if=none,id=drive-virtio-disk0,format=raw,serial=3c6afa29-7cef-4881-823c-fe523e069bea,cache=none,bps=52428800,iops=3000 -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -netdev tap,fd=26,id=hostnet0,vhost=on,vhostfd=38 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=fa:16:3e:19:30:d5,bus=pci.0,addr=0x3 -chardev file,id=charserial0,path=/var/lib/nova/instances/4a6391d2-a26f-448b-a693-ba4b10d6ee6d/console.log -device isa-serial,chardev=charserial0,id=serial0 -chardev pty,id=charserial1 -device isa-serial,chardev=charserial1,id=serial1 -device usb-tablet,id=input0 -vnc 0.0.0.0:13 -k en-us -device cirrus-vga,id=video0,bus=pci.0,addr=0x2 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5
Domain id=171 is tainted: high-privileges
char device redirected to /dev/pts/19 (label charserial1)

** Affects: linux (Ubuntu)
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1534049

Title:
  Ubuntu 12.04 + QEmu 2.0 + KSM = 1 + OVS, makes Windows 2008 R2 guests
  to crash

Status in linux package in Ubuntu:
  New


To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1534049/+subscriptions

