yahoo-eng-team team mailing list archive
Message #75516
[Bug 1800511] Re: VMs started before Rocky upgrade cannot be live migrated
FWIW I don't think
https://github.com/openstack/nova/commit/2b52cde565d542c03f004b48ee9c1a6a25f5b7cd
really changed whether
https://github.com/openstack/nova/commit/f02b3800051234ecc14f3117d5987b1a8ef75877
could have broken live migration. _update_vif_xml is called from the source
host using migrate data from the dest host, but as far as I know that
migrate data doesn't have any information about mtu from the dest to
determine what to set in the source vif config. Before _update_vif_xml,
we would have just sent the source guest XML vif config to the dest, and
if the dest didn't support MTU it would also have failed.
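
To make the difference in vif config concrete, here is a minimal sketch that
lists which interfaces in a guest XML carry no <mtu> element. It assumes
guests started before the upgrade have no <mtu> child under <interface>; the
helper name and the sample XML are hypothetical and are not taken from nova
or from this bug.

```python
import xml.etree.ElementTree as ET
from typing import List


def interfaces_missing_mtu(domain_xml: str) -> List[str]:
    """Return the MAC addresses of <interface> elements without an <mtu> child."""
    root = ET.fromstring(domain_xml)
    missing = []
    for iface in root.findall("./devices/interface"):
        mac = iface.find("mac")
        address = mac.get("address") if mac is not None else "<unknown>"
        if iface.find("mtu") is None:
            missing.append(address)
    return missing


# Hypothetical guest XML: the first interface has no MTU (as assumed for a
# guest started before the upgrade), the second one has it set.
SAMPLE_XML = """
<domain type='kvm'>
  <devices>
    <interface type='bridge'>
      <mac address='fa:16:3e:00:00:01'/>
      <model type='virtio'/>
    </interface>
    <interface type='bridge'>
      <mac address='fa:16:3e:00:00:02'/>
      <model type='virtio'/>
      <mtu size='1450'/>
    </interface>
  </devices>
</domain>
"""

if __name__ == "__main__":
    print(interfaces_missing_mtu(SAMPLE_XML))
```

On a real host the XML could come from `virsh dumpxml` on the source; the
sketch only reports the mismatch and does not decide how nova should handle it.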
** Tags added: libvirt live-migration upgrade
** Changed in: nova
   Importance: Undecided => High
** Changed in: nova
   Status: New => Triaged
** Also affects: nova/rocky
   Importance: Undecided
   Status: New
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1800511
Title:
VMs started before Rocky upgrade cannot be live migrated
Status in OpenStack Compute (nova):
Triaged
Status in OpenStack Compute (nova) rocky series:
Triaged
Bug description:
In Rocky, the following patch started adding the MTU to the guest
network configuration of VMs:
https://github.com/openstack/nova/commit/f02b3800051234ecc14f3117d5987b1a8ef75877
However, this didn't affect live migrations much because Nova didn't
touch the network bits of the XML during live migration, until this
patch:
https://github.com/openstack/nova/commit/2b52cde565d542c03f004b48ee9c1a6a25f5b7cd
With that change, the MTU is added to the configuration, which means
that the destination is launched with host_mtu=N, which apparently
changes the guest ABI (see:
https://bugzilla.redhat.com/show_bug.cgi?id=1449346). This means the
live migration will fail with an error looking like this:
2018-10-29 14:59:15.126+0000: 5289: error : qemuProcessReportLogError:1914 : internal error: qemu unexpectedly closed the monitor: 2018-10-29T14:59:14.977084Z qemu-kvm: get_pci_config_device: Bad config data: i=0x10 read: 61 device: 1 cmask: ff wmask: c0 w1cmask:0
2018-10-29T14:59:14.977105Z qemu-kvm: Failed to load PCIDevice:config
2018-10-29T14:59:14.977109Z qemu-kvm: Failed to load virtio-net:virtio
2018-10-29T14:59:14.977112Z qemu-kvm: error while loading state for instance 0x0 of device ‘0000:00:03.0/virtio-net’
2018-10-29T14:59:14.977283Z qemu-kvm: load of migration failed: Invalid argument
I was able to further verify this by seeing that `host_mtu` exists in
the command line when looking at the destination host instance logs in
/var/log/libvirt/qemu/instance-foo.log
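
The check described above can be sketched as a small script that scans an
instance log for host_mtu=N on the QEMU command line. This is only an
illustration: the function name is hypothetical, and the sample command line
is made up for the example rather than copied from the affected hosts.

```python
import re
import sys
from pathlib import Path
from typing import List

# Matches the host_mtu=N property on a QEMU -device argument.
HOST_MTU_RE = re.compile(r"host_mtu=(\d+)")


def find_host_mtu(log_text: str) -> List[int]:
    """Return every host_mtu value found in the given log text, in order."""
    return [int(value) for value in HOST_MTU_RE.findall(log_text)]


# Illustrative fragment of a QEMU command line (not from the bug report).
SAMPLE_LOG = (
    "-device virtio-net-pci,host_mtu=1450,netdev=hostnet0,id=net0,"
    "mac=fa:16:3e:00:00:01,bus=pci.0,addr=0x3"
)

if __name__ == "__main__":
    # Usage: python check_host_mtu.py /var/log/libvirt/qemu/instance-foo.log
    text = Path(sys.argv[1]).read_text() if len(sys.argv) > 1 else SAMPLE_LOG
    print(find_host_mtu(text))
```

An empty list for the source log and a non-empty one for the destination log
would be consistent with the mismatch described in this report.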
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1800511/+subscriptions