
yahoo-eng-team team mailing list archive

[Bug 1898715] Re: Live migration fails despite matching CPUs

 

Reviewed:  https://review.opendev.org/757577
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=eeeca4ceff576beaa8558360c8a6a165d716f996
Submitter: Zuul
Branch:    master

commit eeeca4ceff576beaa8558360c8a6a165d716f996
Author: Andrew Bonney <andrew.bonney@xxxxxxxxx>
Date:   Tue Oct 6 14:42:38 2020 +0100

    Handle disabled CPU features to fix live migration failures
    
    When performing a live migration between hypervisors running
    libvirt, where one or more CPU features are disabled, nova does
    not take account of these. This results in migration failures
    as none of the available hypervisor targets appear compatible.
    
    This patch ensures that the libvirt 'disable' policy is taken
    into account, at least in a basic sense, by explicitly ignoring
    features flagged in this way when enumerating CPU features.
    
    Closes-Bug: #1898715
    Change-Id: Iaf14ca97cfac99dd280d1114123f2d4bb6292b63


** Changed in: nova
       Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1898715

Title:
  Live migration fails despite matching CPUs

Status in OpenStack Compute (nova):
  Fix Released

Bug description:
  Having upgraded to Ussuri, we've noted that live migrations now always
  fail across our hosts with newer Intel CPUs (identified by libvirt as
  Cascadelake-Server-noTSX).

  When processing the CPU's features, the calls made by Nova to libvirt
  appear to result in an XML segment which includes a 'policy' attribute
  for each feature, and this may be set to 'disable'. When Nova
  interprets this XML (see
  https://github.com/openstack/nova/blob/master/nova/virt/libvirt/host.py#L699
  and
  https://github.com/openstack/nova/blob/master/nova/virt/libvirt/config.py#L670)
  the 'policy' attribute does not appear to be handled, so disabled
  features are counted as present and more CPU features are recorded
  against the hypervisor than it really has.
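
  As a rough illustration of the difference, the sketch below parses a
  CPU XML fragment with Python's standard library rather than Nova's
  own config classes. The sample XML, the feature names in it and both
  function names are hypothetical, and treating a missing 'policy' as
  'require' is an assumption made for this example. It is not the code
  from the patch.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; not taken from the reporter's hosts.
SAMPLE_CPU_XML = """
<cpu mode='custom' match='exact'>
  <model fallback='forbid'>Cascadelake-Server-noTSX</model>
  <feature policy='require' name='ss'/>
  <feature policy='disable' name='mpx'/>
  <feature name='vmx'/>
</cpu>
"""


def all_cpu_features(cpu_xml):
    """Collect every <feature> name, ignoring the 'policy' attribute."""
    root = ET.fromstring(cpu_xml)
    return {feature.get("name") for feature in root.findall("feature")}


def enabled_cpu_features(cpu_xml):
    """Collect <feature> names, skipping those with policy='disable'."""
    root = ET.fromstring(cpu_xml)
    features = set()
    for feature in root.findall("feature"):
        # Assumption: a feature without a 'policy' attribute is required.
        if feature.get("policy", "require") == "disable":
            continue
        features.add(feature.get("name"))
    return features


print(sorted(all_cpu_features(SAMPLE_CPU_XML)))
print(sorted(enabled_cpu_features(SAMPLE_CPU_XML)))
```

  With this sample, the first function also reports 'mpx', which is the
  kind of over-reporting described above, while the second leaves it
  out.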

  When a live migration is scheduled, these additional feature
  requirements are then passed to the remote host, which compares them
  with its own running features and concludes that they are
  incompatible, despite the CPUs being identical. As a result we're
  currently unable to live migrate any VMs between hosts which use
  these CPUs.
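
  The effect on the compatibility check can be modelled very loosely as
  a set difference. This is a simplification for illustration only: the
  real comparison is more involved than this, and the function name and
  feature names below are hypothetical.

```python
def missing_features(source_reported, destination_available):
    """Return features the source asks for that the destination lacks."""
    return sorted(set(source_reported) - set(destination_available))


# Hypothetical feature sets for two identical hosts with 'mpx' disabled.
actually_enabled = {"ss", "vmx"}
reported_ignoring_policy = {"ss", "vmx", "mpx"}

# Source over-reports 'mpx', so the identical destination looks incompatible.
print(missing_features(reported_ignoring_policy, actually_enabled))

# With disabled features left out, nothing is missing.
print(missing_features(actually_enabled, actually_enabled))
```

  In this model a non-empty result stands for a rejected migration
  target, which matches the failure seen between identical hosts.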

  Further debug output is included in
  http://paste.openstack.org/show/798740/

  Nova stable/ussuri 7d556106bfd3e64860dc26226d364876e8bce43c
  Ubuntu 18.04
  libvirt 6.0.0-0ubuntu8.2~cloud0

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1898715/+subscriptions

