yahoo-eng-team team mailing list archive

[Bug 1800511] [NEW] VMs started before Rocky upgrade cannot be live migrated

 

Public bug reported:

In Rocky, the following patch started adding the MTU to the network
configuration of VMs:

https://github.com/openstack/nova/commit/f02b3800051234ecc14f3117d5987b1a8ef75877

However, this had little effect on live migrations, because Nova did not
touch the network portion of the XML during live migration until this
patch:

https://github.com/openstack/nova/commit/2b52cde565d542c03f004b48ee9c1a6a25f5b7cd

With that change, the MTU is added to the destination's configuration
during live migration, so the destination QEMU process is launched with
host_mtu=N, which apparently changes the guest ABI (see:
https://bugzilla.redhat.com/show_bug.cgi?id=1449346).  For a VM that was
started before the upgrade, and therefore runs without host_mtu on the
source, the live migration fails with an error like this:

2018-10-29 14:59:15.126+0000: 5289: error : qemuProcessReportLogError:1914 : internal error: qemu unexpectedly closed the monitor: 2018-10-29T14:59:14.977084Z qemu-kvm: get_pci_config_device: Bad config data: i=0x10 read: 61 device: 1 cmask: ff wmask: c0 w1cmask:0
2018-10-29T14:59:14.977105Z qemu-kvm: Failed to load PCIDevice:config
2018-10-29T14:59:14.977109Z qemu-kvm: Failed to load virtio-net:virtio
2018-10-29T14:59:14.977112Z qemu-kvm: error while loading state for instance 0x0 of device '0000:00:03.0/virtio-net'
2018-10-29T14:59:14.977283Z qemu-kvm: load of migration failed: Invalid argument
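
As a rough illustration of the mechanism described above, the following
Python sketch lists which interfaces in a libvirt domain XML carry an
`<mtu size=.../>` element, which is what ends up as host_mtu on the QEMU
command line. The sample XML, the MAC addresses and the function name
`interfaces_with_mtu` are made up for illustration; this is not Nova's
code.

```python
import xml.etree.ElementTree as ET

# Hypothetical domain XML, trimmed to the parts that matter here.
SAMPLE_XML = """
<domain type='kvm'>
  <devices>
    <interface type='bridge'>
      <mac address='fa:16:3e:00:00:01'/>
      <model type='virtio'/>
      <mtu size='1450'/>
    </interface>
    <interface type='bridge'>
      <mac address='fa:16:3e:00:00:02'/>
      <model type='virtio'/>
    </interface>
  </devices>
</domain>
"""


def interfaces_with_mtu(domain_xml):
    """Return {mac: mtu} for each interface that has an <mtu size=...> element."""
    root = ET.fromstring(domain_xml)
    found = {}
    for iface in root.findall("./devices/interface"):
        mtu = iface.find("mtu")
        mac = iface.find("mac")
        if mtu is not None and mac is not None:
            found[mac.get("address")] = int(mtu.get("size"))
    return found


print(interfaces_with_mtu(SAMPLE_XML))
```

An interface that was defined before the upgrade would look like the
second one in the sample: no `<mtu>` element, so it is not reported.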

I was able to confirm that the destination is launched with `host_mtu`:
it appears in the QEMU command line recorded in the destination host's
instance log, /var/log/libvirt/qemu/instance-foo.log
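
The log check described above can be sketched in Python as below. This
is only a minimal sketch: the function name `find_host_mtu` and both
command-line excerpts are invented for illustration and are not copied
from the affected hosts.

```python
import re

# Matches host_mtu=<number> as it would appear among the -device properties.
HOST_MTU_RE = re.compile(r"\bhost_mtu=(\d+)")


def find_host_mtu(log_text):
    """Return every host_mtu value found in the given log text, in order."""
    return [int(value) for value in HOST_MTU_RE.findall(log_text)]


# Made-up excerpts standing in for QEMU command lines (assumption, not real logs).
without_mtu = (
    "-device virtio-net-pci,netdev=hostnet0,id=net0,"
    "mac=fa:16:3e:00:00:01,bus=pci.0,addr=0x3"
)
with_mtu = (
    "-device virtio-net-pci,host_mtu=1450,netdev=hostnet0,id=net0,"
    "mac=fa:16:3e:00:00:01,bus=pci.0,addr=0x3"
)

print(find_host_mtu(without_mtu))
print(find_host_mtu(with_mtu))
```

To run it against a real instance log, pass the file contents instead,
for example `find_host_mtu(open(path).read())`. An empty result on the
source and a non-empty one on the destination would match the mismatch
described in this report.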

** Affects: nova
     Importance: High
     Assignee: Mohammed Naser (mnaser)
         Status: Triaged


** Tags: libvirt live-migration upgrade

** Changed in: nova
     Assignee: (unassigned) => Mohammed Naser (mnaser)

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1800511

Title:
  VMs started before Rocky upgrade cannot be live migrated

Status in OpenStack Compute (nova):
  Triaged

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1800511/+subscriptions

