← Back to team overview

yahoo-eng-team team mailing list archive

[Bug 1843058] [NEW] libvirt live migration fails intermittently in grenade live migration job with "error while loading state for instance 0x0 of device 'kvm-tpr-opt'"

 

Public bug reported:

This may be related to bug 1838309 but I'm not sure so I'm reporting it
separately so we can track it in elastic-recheck. This is the traceback
in the nova-compute logs:

Sep 06 01:28:11.837685 ubuntu-bionic-rax-iad-0010857269 nova-compute[3855]: DEBUG nova.virt.libvirt.driver [None req-e6bcaa2e-aa66-4107-b0c6-9b3976d45c76 None None] [instance: 64689c1f-27b6-4889-8206-3bc458427197] Migration operation thread notification {{(pid=3855) thread_finished /opt/stack/old/nova/nova/virt/libvirt/driver.py:8039}}
Sep 06 01:28:11.838031 ubuntu-bionic-rax-iad-0010857269 nova-compute[3855]: Traceback (most recent call last):
Sep 06 01:28:11.838031 ubuntu-bionic-rax-iad-0010857269 nova-compute[3855]:   File "/usr/local/lib/python2.7/dist-packages/eventlet/hubs/hub.py", line 460, in fire_timers
Sep 06 01:28:11.838282 ubuntu-bionic-rax-iad-0010857269 nova-compute[3855]:     timer()
Sep 06 01:28:11.838282 ubuntu-bionic-rax-iad-0010857269 nova-compute[3855]:   File "/usr/local/lib/python2.7/dist-packages/eventlet/hubs/timer.py", line 59, in __call__
Sep 06 01:28:11.838561 ubuntu-bionic-rax-iad-0010857269 nova-compute[3855]:     cb(*args, **kw)
Sep 06 01:28:11.838561 ubuntu-bionic-rax-iad-0010857269 nova-compute[3855]:   File "/usr/local/lib/python2.7/dist-packages/eventlet/event.py", line 175, in _do_send
Sep 06 01:28:11.838774 ubuntu-bionic-rax-iad-0010857269 nova-compute[3855]:     waiter.switch(result)
Sep 06 01:28:11.838774 ubuntu-bionic-rax-iad-0010857269 nova-compute[3855]:   File "/usr/local/lib/python2.7/dist-packages/eventlet/greenthread.py", line 219, in main
Sep 06 01:28:11.839008 ubuntu-bionic-rax-iad-0010857269 nova-compute[3855]:     result = function(*args, **kwargs)
Sep 06 01:28:11.839008 ubuntu-bionic-rax-iad-0010857269 nova-compute[3855]:   File "/opt/stack/old/nova/nova/utils.py", line 800, in context_wrapper
Sep 06 01:28:11.839688 ubuntu-bionic-rax-iad-0010857269 nova-compute[3855]:     return func(*args, **kwargs)
Sep 06 01:28:11.839688 ubuntu-bionic-rax-iad-0010857269 nova-compute[3855]:   File "/opt/stack/old/nova/nova/virt/libvirt/driver.py", line 7711, in _live_migration_operation
Sep 06 01:28:11.839688 ubuntu-bionic-rax-iad-0010857269 nova-compute[3855]:     LOG.error("Live Migration failure: %s", e, instance=instance)
Sep 06 01:28:11.839688 ubuntu-bionic-rax-iad-0010857269 nova-compute[3855]:   File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
Sep 06 01:28:11.839688 ubuntu-bionic-rax-iad-0010857269 nova-compute[3855]:     self.force_reraise()
Sep 06 01:28:11.839688 ubuntu-bionic-rax-iad-0010857269 nova-compute[3855]:   File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
Sep 06 01:28:11.839688 ubuntu-bionic-rax-iad-0010857269 nova-compute[3855]:     six.reraise(self.type_, self.value, self.tb)
Sep 06 01:28:11.839688 ubuntu-bionic-rax-iad-0010857269 nova-compute[3855]:   File "/opt/stack/old/nova/nova/virt/libvirt/driver.py", line 7704, in _live_migration_operation
Sep 06 01:28:11.840435 ubuntu-bionic-rax-iad-0010857269 nova-compute[3855]:     bandwidth=CONF.libvirt.live_migration_bandwidth)
Sep 06 01:28:11.840435 ubuntu-bionic-rax-iad-0010857269 nova-compute[3855]:   File "/opt/stack/old/nova/nova/virt/libvirt/guest.py", line 682, in migrate
Sep 06 01:28:11.840435 ubuntu-bionic-rax-iad-0010857269 nova-compute[3855]:     destination, params=params, flags=flags)
Sep 06 01:28:11.840435 ubuntu-bionic-rax-iad-0010857269 nova-compute[3855]:   File "/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 190, in doit
Sep 06 01:28:11.840435 ubuntu-bionic-rax-iad-0010857269 nova-compute[3855]:     result = proxy_call(self._autowrap, f, *args, **kwargs)
Sep 06 01:28:11.840435 ubuntu-bionic-rax-iad-0010857269 nova-compute[3855]:   File "/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 148, in proxy_call
Sep 06 01:28:11.840435 ubuntu-bionic-rax-iad-0010857269 nova-compute[3855]:     rv = execute(f, *args, **kwargs)
Sep 06 01:28:11.840435 ubuntu-bionic-rax-iad-0010857269 nova-compute[3855]:   File "/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 129, in execute
Sep 06 01:28:11.840435 ubuntu-bionic-rax-iad-0010857269 nova-compute[3855]:     six.reraise(c, e, tb)
Sep 06 01:28:11.840435 ubuntu-bionic-rax-iad-0010857269 nova-compute[3855]:   File "/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 83, in tworker
Sep 06 01:28:11.841508 ubuntu-bionic-rax-iad-0010857269 nova-compute[3855]:     rv = meth(*args, **kwargs)
Sep 06 01:28:11.841508 ubuntu-bionic-rax-iad-0010857269 nova-compute[3855]:   File "/usr/local/lib/python2.7/dist-packages/libvirt.py", line 1745, in migrateToURI3
Sep 06 01:28:11.843021 ubuntu-bionic-rax-iad-0010857269 nova-compute[3855]:     if ret == -1: raise libvirtError ('virDomainMigrateToURI3() failed', dom=self)
Sep 06 01:28:11.843239 ubuntu-bionic-rax-iad-0010857269 nova-compute[3855]: libvirtError: internal error: qemu unexpectedly closed the monitor: 2019-09-06T01:28:10.940146Z qemu-system-x86_64: warning: TCG doesn't support requested feature: CPUID.01H:ECX.vmx [bit 5]
Sep 06 01:28:11.843239 ubuntu-bionic-rax-iad-0010857269 nova-compute[3855]: 2019-09-06T01:28:11.706713Z qemu-system-x86_64: error while loading state for instance 0x0 of device 'kvm-tpr-opt'
Sep 06 01:28:11.843239 ubuntu-bionic-rax-iad-0010857269 nova-compute[3855]: 2019-09-06T01:28:11.707623Z qemu-system-x86_64: load of migration failed: Operation not permitted

This is the guest qemu log:

2019-09-06 01:27:54.606+0000: initiating migration
2019-09-06 01:27:55.378+0000: shutting down, reason=migrated
2019-09-06T01:27:55.379158Z qemu-system-x86_64: terminating on signal 15 from pid 32389 (/usr/sbin/libvirtd)
2019-09-06 01:28:10.699+0000: starting up libvirt version: 4.0.0, package: 1ubuntu8.12 (Marc Deslauriers <marc.deslauriers@xxxxxxxxxx> Tue, 02 Jul 2019 09:19:33 -0400), qemu version: 2.11.1(Debian 1:2.11+dfsg-1ubuntu7.18), hostname: ubuntu-bionic-rax-iad-0010857256
LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin QEMU_AUDIO_DRV=none /usr/bin/qemu-system-x86_64 -name guest=instance-00000021,debug-threads=on -S -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-29-instance-00000021/master-key.aes -machine pc-i440fx-2.11,accel=tcg,usb=off,dump-guest-core=off -m 64 -realtime mlock=off -smp 1,sockets=1,cores=1,threads=1 -uuid 64689c1f-27b6-4889-8206-3bc458427197 -smbios 'type=1,manufacturer=OpenStack Foundation,product=OpenStack Nova,version=19.1.0,serial=64689c1f-27b6-4889-8206-3bc458427197,uuid=64689c1f-27b6-4889-8206-3bc458427197,family=Virtual Machine' -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-29-instance-00000021/monitor.sock,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown -boot strict=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -object secret,id=virtio-disk0-secret0,data=JSeIib66AdJhXaNumn38Lk8ElHKyPOTCOwjG2uQiX6Q=,keyid=masterKey0,iv=OwpOylx/6ARMhASECPtqwg==,format=base64 -drive 'file=rbd:vms/64689c1f-27b6-4889-8206-3bc458427197_disk:id=cinder:auth_supported=cephx\;none:mon_host=10.208.226.92\:6789,file.password-secret=virtio-disk0-secret0,format=raw,if=none,id=drive-virtio-disk0,cache=writeback' -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -netdev tap,fd=29,id=hostnet0 -device virtio-net-pci,host_mtu=1400,netdev=hostnet0,id=net0,mac=fa:16:3e:6b:59:0f,bus=pci.0,addr=0x3 -add-fd set=1,fd=32 -chardev pty,id=charserial0,logfile=/dev/fdset/1,logappend=on -device isa-serial,chardev=charserial0,id=serial0 -vnc 0.0.0.0:1 -device cirrus-vga,id=video0,bus=pci.0,addr=0x2 -incoming defer -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5 -msg timestamp=on
2019-09-06T01:28:10.842062Z qemu-system-x86_64: -chardev pty,id=charserial0,logfile=/dev/fdset/1,logappend=on: char device redirected to /dev/pts/1 (label charserial0)
2019-09-06T01:28:10.940146Z qemu-system-x86_64: warning: TCG doesn't support requested feature: CPUID.01H:ECX.vmx [bit 5]
2019-09-06T01:28:11.706713Z qemu-system-x86_64: error while loading state for instance 0x0 of device 'kvm-tpr-opt'
2019-09-06T01:28:11.707623Z qemu-system-x86_64: load of migration failed: Operation not permitted
2019-09-06 01:28:11.743+0000: shutting down, reason=failed

http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22error%20while%20loading%20state%20for%20instance%5C%22%20AND%20message%3A%5C%22of%20device%20
'kvm-tpr-opt'%5C%22%20AND%20tags%3A%5C%22screen-n-cpu.txt%5C%22&from=7d

4 hits in 7 days so it seems pretty rare.

** Affects: nova
     Importance: Undecided
         Status: Confirmed


** Tags: grenade libvirt live-migration

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1843058

Title:
  libvirt live migration fails intermittently in grenade live migration
  job with "error while loading state for instance 0x0 of device 'kvm-
  tpr-opt'"

Status in OpenStack Compute (nova):
  Confirmed

Bug description:
  This may be related to bug 1838309 but I'm not sure so I'm reporting
  it separately so we can track it in elastic-recheck. This is the
  traceback in the nova-compute logs:

  Sep 06 01:28:11.837685 ubuntu-bionic-rax-iad-0010857269 nova-compute[3855]: DEBUG nova.virt.libvirt.driver [None req-e6bcaa2e-aa66-4107-b0c6-9b3976d45c76 None None] [instance: 64689c1f-27b6-4889-8206-3bc458427197] Migration operation thread notification {{(pid=3855) thread_finished /opt/stack/old/nova/nova/virt/libvirt/driver.py:8039}}
  Sep 06 01:28:11.838031 ubuntu-bionic-rax-iad-0010857269 nova-compute[3855]: Traceback (most recent call last):
  Sep 06 01:28:11.838031 ubuntu-bionic-rax-iad-0010857269 nova-compute[3855]:   File "/usr/local/lib/python2.7/dist-packages/eventlet/hubs/hub.py", line 460, in fire_timers
  Sep 06 01:28:11.838282 ubuntu-bionic-rax-iad-0010857269 nova-compute[3855]:     timer()
  Sep 06 01:28:11.838282 ubuntu-bionic-rax-iad-0010857269 nova-compute[3855]:   File "/usr/local/lib/python2.7/dist-packages/eventlet/hubs/timer.py", line 59, in __call__
  Sep 06 01:28:11.838561 ubuntu-bionic-rax-iad-0010857269 nova-compute[3855]:     cb(*args, **kw)
  Sep 06 01:28:11.838561 ubuntu-bionic-rax-iad-0010857269 nova-compute[3855]:   File "/usr/local/lib/python2.7/dist-packages/eventlet/event.py", line 175, in _do_send
  Sep 06 01:28:11.838774 ubuntu-bionic-rax-iad-0010857269 nova-compute[3855]:     waiter.switch(result)
  Sep 06 01:28:11.838774 ubuntu-bionic-rax-iad-0010857269 nova-compute[3855]:   File "/usr/local/lib/python2.7/dist-packages/eventlet/greenthread.py", line 219, in main
  Sep 06 01:28:11.839008 ubuntu-bionic-rax-iad-0010857269 nova-compute[3855]:     result = function(*args, **kwargs)
  Sep 06 01:28:11.839008 ubuntu-bionic-rax-iad-0010857269 nova-compute[3855]:   File "/opt/stack/old/nova/nova/utils.py", line 800, in context_wrapper
  Sep 06 01:28:11.839688 ubuntu-bionic-rax-iad-0010857269 nova-compute[3855]:     return func(*args, **kwargs)
  Sep 06 01:28:11.839688 ubuntu-bionic-rax-iad-0010857269 nova-compute[3855]:   File "/opt/stack/old/nova/nova/virt/libvirt/driver.py", line 7711, in _live_migration_operation
  Sep 06 01:28:11.839688 ubuntu-bionic-rax-iad-0010857269 nova-compute[3855]:     LOG.error("Live Migration failure: %s", e, instance=instance)
  Sep 06 01:28:11.839688 ubuntu-bionic-rax-iad-0010857269 nova-compute[3855]:   File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
  Sep 06 01:28:11.839688 ubuntu-bionic-rax-iad-0010857269 nova-compute[3855]:     self.force_reraise()
  Sep 06 01:28:11.839688 ubuntu-bionic-rax-iad-0010857269 nova-compute[3855]:   File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
  Sep 06 01:28:11.839688 ubuntu-bionic-rax-iad-0010857269 nova-compute[3855]:     six.reraise(self.type_, self.value, self.tb)
  Sep 06 01:28:11.839688 ubuntu-bionic-rax-iad-0010857269 nova-compute[3855]:   File "/opt/stack/old/nova/nova/virt/libvirt/driver.py", line 7704, in _live_migration_operation
  Sep 06 01:28:11.840435 ubuntu-bionic-rax-iad-0010857269 nova-compute[3855]:     bandwidth=CONF.libvirt.live_migration_bandwidth)
  Sep 06 01:28:11.840435 ubuntu-bionic-rax-iad-0010857269 nova-compute[3855]:   File "/opt/stack/old/nova/nova/virt/libvirt/guest.py", line 682, in migrate
  Sep 06 01:28:11.840435 ubuntu-bionic-rax-iad-0010857269 nova-compute[3855]:     destination, params=params, flags=flags)
  Sep 06 01:28:11.840435 ubuntu-bionic-rax-iad-0010857269 nova-compute[3855]:   File "/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 190, in doit
  Sep 06 01:28:11.840435 ubuntu-bionic-rax-iad-0010857269 nova-compute[3855]:     result = proxy_call(self._autowrap, f, *args, **kwargs)
  Sep 06 01:28:11.840435 ubuntu-bionic-rax-iad-0010857269 nova-compute[3855]:   File "/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 148, in proxy_call
  Sep 06 01:28:11.840435 ubuntu-bionic-rax-iad-0010857269 nova-compute[3855]:     rv = execute(f, *args, **kwargs)
  Sep 06 01:28:11.840435 ubuntu-bionic-rax-iad-0010857269 nova-compute[3855]:   File "/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 129, in execute
  Sep 06 01:28:11.840435 ubuntu-bionic-rax-iad-0010857269 nova-compute[3855]:     six.reraise(c, e, tb)
  Sep 06 01:28:11.840435 ubuntu-bionic-rax-iad-0010857269 nova-compute[3855]:   File "/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 83, in tworker
  Sep 06 01:28:11.841508 ubuntu-bionic-rax-iad-0010857269 nova-compute[3855]:     rv = meth(*args, **kwargs)
  Sep 06 01:28:11.841508 ubuntu-bionic-rax-iad-0010857269 nova-compute[3855]:   File "/usr/local/lib/python2.7/dist-packages/libvirt.py", line 1745, in migrateToURI3
  Sep 06 01:28:11.843021 ubuntu-bionic-rax-iad-0010857269 nova-compute[3855]:     if ret == -1: raise libvirtError ('virDomainMigrateToURI3() failed', dom=self)
  Sep 06 01:28:11.843239 ubuntu-bionic-rax-iad-0010857269 nova-compute[3855]: libvirtError: internal error: qemu unexpectedly closed the monitor: 2019-09-06T01:28:10.940146Z qemu-system-x86_64: warning: TCG doesn't support requested feature: CPUID.01H:ECX.vmx [bit 5]
  Sep 06 01:28:11.843239 ubuntu-bionic-rax-iad-0010857269 nova-compute[3855]: 2019-09-06T01:28:11.706713Z qemu-system-x86_64: error while loading state for instance 0x0 of device 'kvm-tpr-opt'
  Sep 06 01:28:11.843239 ubuntu-bionic-rax-iad-0010857269 nova-compute[3855]: 2019-09-06T01:28:11.707623Z qemu-system-x86_64: load of migration failed: Operation not permitted

  This is the guest qemu log:

  2019-09-06 01:27:54.606+0000: initiating migration
  2019-09-06 01:27:55.378+0000: shutting down, reason=migrated
  2019-09-06T01:27:55.379158Z qemu-system-x86_64: terminating on signal 15 from pid 32389 (/usr/sbin/libvirtd)
  2019-09-06 01:28:10.699+0000: starting up libvirt version: 4.0.0, package: 1ubuntu8.12 (Marc Deslauriers <marc.deslauriers@xxxxxxxxxx> Tue, 02 Jul 2019 09:19:33 -0400), qemu version: 2.11.1(Debian 1:2.11+dfsg-1ubuntu7.18), hostname: ubuntu-bionic-rax-iad-0010857256
  LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin QEMU_AUDIO_DRV=none /usr/bin/qemu-system-x86_64 -name guest=instance-00000021,debug-threads=on -S -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-29-instance-00000021/master-key.aes -machine pc-i440fx-2.11,accel=tcg,usb=off,dump-guest-core=off -m 64 -realtime mlock=off -smp 1,sockets=1,cores=1,threads=1 -uuid 64689c1f-27b6-4889-8206-3bc458427197 -smbios 'type=1,manufacturer=OpenStack Foundation,product=OpenStack Nova,version=19.1.0,serial=64689c1f-27b6-4889-8206-3bc458427197,uuid=64689c1f-27b6-4889-8206-3bc458427197,family=Virtual Machine' -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-29-instance-00000021/monitor.sock,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown -boot strict=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -object secret,id=virtio-disk0-secret0,data=JSeIib66AdJhXaNumn38Lk8ElHKyPOTCOwjG2uQiX6Q=,keyid=masterKey0,iv=OwpOylx/6ARMhASECPtqwg==,format=base64 -drive 'file=rbd:vms/64689c1f-27b6-4889-8206-3bc458427197_disk:id=cinder:auth_supported=cephx\;none:mon_host=10.208.226.92\:6789,file.password-secret=virtio-disk0-secret0,format=raw,if=none,id=drive-virtio-disk0,cache=writeback' -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -netdev tap,fd=29,id=hostnet0 -device virtio-net-pci,host_mtu=1400,netdev=hostnet0,id=net0,mac=fa:16:3e:6b:59:0f,bus=pci.0,addr=0x3 -add-fd set=1,fd=32 -chardev pty,id=charserial0,logfile=/dev/fdset/1,logappend=on -device isa-serial,chardev=charserial0,id=serial0 -vnc 0.0.0.0:1 -device cirrus-vga,id=video0,bus=pci.0,addr=0x2 -incoming defer -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5 -msg timestamp=on
  2019-09-06T01:28:10.842062Z qemu-system-x86_64: -chardev pty,id=charserial0,logfile=/dev/fdset/1,logappend=on: char device redirected to /dev/pts/1 (label charserial0)
  2019-09-06T01:28:10.940146Z qemu-system-x86_64: warning: TCG doesn't support requested feature: CPUID.01H:ECX.vmx [bit 5]
  2019-09-06T01:28:11.706713Z qemu-system-x86_64: error while loading state for instance 0x0 of device 'kvm-tpr-opt'
  2019-09-06T01:28:11.707623Z qemu-system-x86_64: load of migration failed: Operation not permitted
  2019-09-06 01:28:11.743+0000: shutting down, reason=failed

  http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22error%20while%20loading%20state%20for%20instance%5C%22%20AND%20message%3A%5C%22of%20device%20
  'kvm-tpr-
  opt'%5C%22%20AND%20tags%3A%5C%22screen-n-cpu.txt%5C%22&from=7d

  4 hits in 7 days so it seems pretty rare.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1843058/+subscriptions