yahoo-eng-team team mailing list archive

[Bug 2039803] Re: compareHypervisorCPU() incompatibility during live migration

 

Added charm-nova-compute to the bug. We will need to introduce a config
option in the charm to toggle the skip_cpu_compare_at_startup and
skip_cpu_compare_on_dest options.

** Also affects: charm-nova-compute
   Importance: Undecided
       Status: New

** Also affects: charm-nova-compute/2024.1
   Importance: Undecided
       Status: New

** Changed in: charm-nova-compute
       Status: New => Triaged

** Changed in: charm-nova-compute
     Assignee: (unassigned) => Bryan Fraschetti (bryanfraschetti)

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/2039803

Title:
  compareHypervisorCPU() incompatibility during live migration

Status in OpenStack Nova Compute Charm:
  Triaged
Status in OpenStack Nova Compute Charm 2024.1 series:
  New
Status in OpenStack Compute (nova):
  In Progress

Bug description:
  Description
  ===========
  Live migration fails with the following error:

  Refer to http://libvirt.org/html/libvirt-libvirt-host.html#virCPUCompareResult
  2023-10-17 08:28:15.301 2 ERROR oslo_messaging.rpc.server Traceback (most recent call last):
  2023-10-17 08:28:15.301 2 ERROR oslo_messaging.rpc.server   File "/usr/lib/python3.9/site-packages/nova/virt/libvirt/driver.py", line 9636, in check_can_live_migrate_destination
  2023-10-17 08:28:15.301 2 ERROR oslo_messaging.rpc.server     self._compare_cpu(None, source_cpu_info, instance)
  2023-10-17 08:28:15.301 2 ERROR oslo_messaging.rpc.server   File "/usr/lib/python3.9/site-packages/nova/virt/libvirt/driver.py", line 10013, in _compare_cpu
  2023-10-17 08:28:15.301 2 ERROR oslo_messaging.rpc.server     raise exception.InvalidCPUInfo(reason=m % {'ret': ret, 'u': u})
  2023-10-17 08:28:15.301 2 ERROR oslo_messaging.rpc.server nova.exception.InvalidCPUInfo: Unacceptable CPU info: CPU doesn't have compatibility.

  [...]

  2023-10-17 08:28:15.301 2 ERROR oslo_messaging.rpc.server   File "/usr/lib/python3.9/site-packages/nova/virt/libvirt/driver.py", line 9640, in check_can_live_migrate_destination
  2023-10-17 08:28:15.301 2 ERROR oslo_messaging.rpc.server     raise exception.MigrationPreCheckError(reason=e)
  2023-10-17 08:28:15.301 2 ERROR oslo_messaging.rpc.server nova.exception.MigrationPreCheckError: Migration pre-check error: Unacceptable CPU info: CPU doesn't have compatibility.

  
  If skip_cpu_compare_on_dest is set to True, the live migration succeeds. So the issue seems to be only in the check nova performs; the hypervisors are actually compatible.

  
  Steps to reproduce
  ==================
  * boot a simple cirros VM
  * openstack server migrate --live --block-migration <vm>

  
  Environment
  ===========
  OpenStack: 2023.1.
  libvirt version: 9.5.0
  QEMU: 8.1.0
  Hypervisors: two CentOS Stream 9 VMs with nested KVM enabled
  nova compute is configured with cpu_mode=host-model

  Triage
  ======

  
  During the pre_live_migration check running on the destination node, nova sees that the guest has no vcpu_model set in the DB and therefore falls back to a comparison based on the host CPU model [1]. The host cpu_info used there is collected with getCapabilities() from libvirt [2], which on this system returns SandyBridge. On the other hand, the guest VM is running as Broadwell (note that nova is configured with cpu_mode=host-model), and virsh domcapabilities also returns Broadwell as the host model.

  There are two reasons for the failure:
  1) nova uses getCapabilities() to determine the host CPU model, but uses the model from domCapabilities for a guest VM running with host-model. According to the libvirt maintainers, nova should no longer use getCapabilities() for anything.

  2) nova falls back to a host-CPU-based comparison if the guest
  vcpu_model is not filled in the nova DB. But for live migration the
  guest CPU model should be available, as the guest exists and is running
  on the source node.

  
  [1] https://github.com/openstack/nova/blob/a869ab17c095cbff2c942ab94247b0c30723b230/nova/virt/libvirt/driver.py#L9960-L9975
  [2] https://github.com/openstack/nova/blob/a869ab17c095cbff2c942ab94247b0c30723b230/nova/virt/libvirt/host.py#L793-L796
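
The mismatch described in the triage can be illustrated with a minimal sketch that extracts the host CPU model from both kinds of libvirt XML and compares them. This is not nova's code: the function names are hypothetical, and the XML documents are trimmed samples that only assume the usual element layout of the two outputs. The model names are the ones reported in this bug.

```python
import xml.etree.ElementTree as ET

# Trimmed, illustrative XML shaped like libvirt's two outputs. Only the model
# names (SandyBridge, Broadwell) come from the bug report; the rest is sample
# data assumed for this sketch.
CAPABILITIES_XML = """
<capabilities>
  <host>
    <cpu>
      <arch>x86_64</arch>
      <model>SandyBridge</model>
    </cpu>
  </host>
</capabilities>
"""

DOMAIN_CAPABILITIES_XML = """
<domainCapabilities>
  <cpu>
    <mode name='host-passthrough' supported='yes'/>
    <mode name='host-model' supported='yes'>
      <model fallback='forbid'>Broadwell</model>
    </mode>
  </cpu>
</domainCapabilities>
"""


def host_model_from_capabilities(xml_text):
    """Return the host CPU model found in capabilities XML (hypothetical helper)."""
    root = ET.fromstring(xml_text.strip())
    return root.findtext("./host/cpu/model")


def host_model_from_domain_capabilities(xml_text):
    """Return the host-model CPU found in domain capabilities XML (hypothetical helper)."""
    root = ET.fromstring(xml_text.strip())
    return root.findtext("./cpu/mode[@name='host-model']/model")


def models_disagree(caps_xml, domcaps_xml):
    """True when the two sources name different host CPU models."""
    return (host_model_from_capabilities(caps_xml)
            != host_model_from_domain_capabilities(domcaps_xml))


if __name__ == "__main__":
    print("capabilities:       ", host_model_from_capabilities(CAPABILITIES_XML))
    print("domain capabilities:", host_model_from_domain_capabilities(DOMAIN_CAPABILITIES_XML))
    print("disagree:           ", models_disagree(CAPABILITIES_XML, DOMAIN_CAPABILITIES_XML))
```

On a real hypervisor the two XML strings would presumably come from libvirt-python's getCapabilities() and getDomainCapabilities() calls on an open connection; the sample strings above stand in for them so the sketch runs without libvirt.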

To manage notifications about this bug go to:
https://bugs.launchpad.net/charm-nova-compute/+bug/2039803/+subscriptions


