yahoo-eng-team team mailing list archive

[Bug 1800511] Re: VMs with vif_type bridge/tap started before Rocky upgrade cannot be live migrated

 

Reviewed:  https://review.openstack.org/614008
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=643f53f5e9544d6c98833b1ae3dd472602118a1f
Submitter: Zuul
Branch:    master

commit 643f53f5e9544d6c98833b1ae3dd472602118a1f
Author: Mohammed Naser <mnaser@xxxxxxxxxxxx>
Date:   Mon Oct 29 19:49:41 2018 +0100

    libvirt: Avoid setting MTU during live migration if unset
    
    An instance that was launched before change
    Iecc265fb25e88fa00a66f1fd38e215cad53e7669 does not have an mtu
    set and therefore does not have one in its domain XML.
    
    When live migrating, the mtu is added which changes the
    guest ABI[1], causing the live migration to fail.  The failure
    occurs when trying to live migrate an instance that:
    
    - Was launched before change Iecc265fb25e88fa00a66f1fd38e215cad53e7669
    - Has not been rebooted since (i.e. its XML has not changed)
    - Uses bridge/tap networking
    - Is migrated after change Iecc265fb25e88fa00a66f1fd38e215cad53e7669
    
    This patch prevents this by avoiding setting MTU if the running
    instance does not have one configured in its domain XML.
    
    [1]: https://bugzilla.redhat.com/show_bug.cgi?id=1449346
    
    Closes-Bug #1800511
    Change-Id: I6e2e6437a7c826dc425d8b353c38670d6eece0b5
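
The check described in the commit message can be sketched roughly as follows. This is not the code from the Nova patch; it is a minimal illustration that assumes the libvirt domain XML carries the MTU as an <mtu> child of <interface>, and the function names (interface_has_mtu, strip_mtu_if_unset) are hypothetical.

```python
import xml.etree.ElementTree as ET


def interface_has_mtu(domain_xml: str, mac: str) -> bool:
    """Return True if the interface with this MAC has an <mtu> element.

    Hypothetical helper: matches interfaces by MAC address, case-insensitively.
    """
    root = ET.fromstring(domain_xml)
    for iface in root.findall("./devices/interface"):
        mac_el = iface.find("mac")
        if mac_el is not None and mac_el.get("address", "").lower() == mac.lower():
            return iface.find("mtu") is not None
    return False


def strip_mtu_if_unset(running_xml: str, dest_iface_xml: str, mac: str) -> str:
    """Drop <mtu> from the destination interface XML when the running
    guest's interface has none, so the guest ABI stays unchanged.

    Hypothetical helper: an illustration of the idea, not Nova's implementation.
    """
    iface = ET.fromstring(dest_iface_xml)
    if not interface_has_mtu(running_xml, mac):
        for mtu in iface.findall("mtu"):
            iface.remove(mtu)
    return ET.tostring(iface, encoding="unicode")
```

If the running guest already has an <mtu> element, the destination interface XML is returned with its MTU intact, so instances launched or rebooted after the MTU change are unaffected.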


** Changed in: nova
       Status: In Progress => Fix Released

** Bug watch added: Red Hat Bugzilla #1449346
   https://bugzilla.redhat.com/show_bug.cgi?id=1449346

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1800511

Title:
  VMs with vif_type bridge/tap started before Rocky upgrade cannot be
  live migrated

Status in OpenStack Compute (nova):
  Fix Released
Status in OpenStack Compute (nova) rocky series:
  In Progress

Bug description:
  In Rocky, the following patch started adding the MTU to the network
  configuration of VMs:

  https://github.com/openstack/nova/commit/f02b3800051234ecc14f3117d5987b1a8ef75877

  However, this had little effect on live migrations, because Nova did
  not touch the network parts of the XML during live migration until
  this patch:

  https://github.com/openstack/nova/commit/2b52cde565d542c03f004b48ee9c1a6a25f5b7cd

  With that change, the MTU is added to the configuration during live
  migration, so the destination is launched with host_mtu=N, which
  apparently changes the guest ABI (see:
  https://bugzilla.redhat.com/show_bug.cgi?id=1449346).  The live
  migration then fails with an error like this:

  2018-10-29 14:59:15.126+0000: 5289: error : qemuProcessReportLogError:1914 : internal error: qemu unexpectedly closed the monitor: 2018-10-29T14:59:14.977084Z qemu-kvm: get_pci_config_device: Bad config data: i=0x10 read: 61 device: 1 cmask: ff wmask: c0 w1cmask:0
  2018-10-29T14:59:14.977105Z qemu-kvm: Failed to load PCIDevice:config
  2018-10-29T14:59:14.977109Z qemu-kvm: Failed to load virtio-net:virtio
  2018-10-29T14:59:14.977112Z qemu-kvm: error while loading state for instance 0x0 of device '0000:00:03.0/virtio-net'
  2018-10-29T14:59:14.977283Z qemu-kvm: load of migration failed: Invalid argument

  I further verified this by checking the destination host's instance
  log in /var/log/libvirt/qemu/instance-foo.log, where `host_mtu`
  appears on the QEMU command line.
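
The log check above could be scripted along these lines. This is only a sketch: it assumes the option appears in the log text literally as host_mtu=N, and the function name find_host_mtu is hypothetical.

```python
import re
from typing import Optional

# Assumption: the option shows up in the log text as "host_mtu=<digits>".
HOST_MTU_RE = re.compile(r"host_mtu=(\d+)")


def find_host_mtu(qemu_log_text: str) -> Optional[int]:
    """Return the last host_mtu value found in the log text, or None.

    The last match is used because a per-instance log may contain the
    command lines of several successive QEMU launches.
    """
    matches = HOST_MTU_RE.findall(qemu_log_text)
    return int(matches[-1]) if matches else None
```

A result of None for the source host's log and a number for the destination's would be consistent with the mismatch described in this report.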

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1800511/+subscriptions

