
yahoo-eng-team team mailing list archive

[Bug 1800511] Re: VMs started before Rocky upgrade cannot be live migrated

 

FWIW I don't think
https://github.com/openstack/nova/commit/2b52cde565d542c03f004b48ee9c1a6a25f5b7cd
really changed how
https://github.com/openstack/nova/commit/f02b3800051234ecc14f3117d5987b1a8ef75877
could have broken anything. _update_vif_xml is called on the source
host using migrate data from the dest host, but as far as I know that
migrate data doesn't carry any MTU information from the dest that could
determine what to set in the source VIF config. Before _update_vif_xml,
we would have just sent the source guest XML VIF config to the dest, and
if the dest didn't support MTU it would have failed as well.

** Tags added: libvirt live-migration upgrade

** Changed in: nova
   Importance: Undecided => High

** Changed in: nova
       Status: New => Triaged

** Also affects: nova/rocky
   Importance: Undecided
       Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1800511

Title:
  VMs started before Rocky upgrade cannot be live migrated

Status in OpenStack Compute (nova):
  Triaged
Status in OpenStack Compute (nova) rocky series:
  Triaged

Bug description:
  In Rocky, the following patch started adding the MTU to the network
  configuration of VMs:

  https://github.com/openstack/nova/commit/f02b3800051234ecc14f3117d5987b1a8ef75877

  However, this didn't affect live migrations much because Nova didn't
  touch the network bits of the XML during live migration, until this
  patch:

  https://github.com/openstack/nova/commit/2b52cde565d542c03f004b48ee9c1a6a25f5b7cd

  With that change, the MTU is added to the interface configuration in
  the migration XML, so the destination QEMU process is launched with
  host_mtu=N, which apparently changes the guest ABI (see:
  https://bugzilla.redhat.com/show_bug.cgi?id=1449346). As a result, the
  live migration fails with an error like this:

  2018-10-29 14:59:15.126+0000: 5289: error : qemuProcessReportLogError:1914 : internal error: qemu unexpectedly closed the monitor: 2018-10-29T14:59:14.977084Z qemu-kvm: get_pci_config_device: Bad config data: i=0x10 read: 61 device: 1 cmask: ff wmask: c0 w1cmask:0
  2018-10-29T14:59:14.977105Z qemu-kvm: Failed to load PCIDevice:config
  2018-10-29T14:59:14.977109Z qemu-kvm: Failed to load virtio-net:virtio
  2018-10-29T14:59:14.977112Z qemu-kvm: error while loading state for instance 0x0 of device '0000:00:03.0/virtio-net'
  2018-10-29T14:59:14.977283Z qemu-kvm: load of migration failed: Invalid argument
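
To see whether the MTU actually differs between two guest definitions, one rough approach is to list which interfaces in a libvirt domain XML carry an `<mtu size='...'/>` element. The sketch below is only an illustration: the helper name `interfaces_with_mtu` and the sample XML (MAC addresses, MTU value) are made up for this example and are not taken from Nova.

```python
import xml.etree.ElementTree as ET


def interfaces_with_mtu(domain_xml):
    """Return {mac_address: mtu_size} for interfaces that have an <mtu> element.

    Hypothetical helper for illustration; not part of Nova.
    """
    root = ET.fromstring(domain_xml)
    result = {}
    for iface in root.findall("./devices/interface"):
        mtu = iface.find("mtu")
        if mtu is None:
            # Interface defined without an MTU, e.g. a guest started
            # before the MTU was added to the generated XML.
            continue
        mac = iface.find("mac")
        key = mac.get("address") if mac is not None else None
        result[key] = mtu.get("size")
    return result


# Made-up sample: one interface with an MTU, one without.
SAMPLE_XML = """
<domain type='kvm'>
  <devices>
    <interface type='bridge'>
      <mac address='fa:16:3e:00:00:01'/>
      <model type='virtio'/>
      <mtu size='1450'/>
    </interface>
    <interface type='bridge'>
      <mac address='fa:16:3e:00:00:02'/>
      <model type='virtio'/>
    </interface>
  </devices>
</domain>
"""

mtus = interfaces_with_mtu(SAMPLE_XML)
```

Running this against the XML of the running source guest and the XML sent to the destination would show whether the migration adds an MTU that the source guest never had.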

  I further verified this by confirming that `host_mtu` appears in the
  QEMU command line in the destination host's instance log at
  /var/log/libvirt/qemu/instance-foo.log
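
The same check can be scripted. This is a minimal sketch, assuming the log contains the QEMU command line as plain text; the function name `find_host_mtu` and the sample command-line fragment are hypothetical and only stand in for a real instance log.

```python
import re

# Matches the host_mtu=N property on a QEMU -device argument.
HOST_MTU_RE = re.compile(r"host_mtu=(\d+)")


def find_host_mtu(log_text):
    """Return every host_mtu value found in the log text, in order."""
    return [int(value) for value in HOST_MTU_RE.findall(log_text)]


# Made-up fragment of a QEMU command line, for illustration only.
SAMPLE_LOG = (
    "-netdev tap,fd=26,id=hostnet0 "
    "-device virtio-net-pci,host_mtu=1450,netdev=hostnet0,id=net0,"
    "mac=fa:16:3e:00:00:01,bus=pci.0,addr=0x3"
)

values = find_host_mtu(SAMPLE_LOG)
```

An empty result for the source host's log and a non-empty one for the destination would match the mismatch described above.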

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1800511/+subscriptions

