yahoo-eng-team team mailing list archive
Message #88950
[Bug 1913716] Re: Live-migrating an instance from 'Queens' (CentOS-7) to 'Train' (CentOS-8) fails during libvirt's compareCPU() check
Reviewed: https://review.opendev.org/c/openstack/nova/+/838926
Committed: https://opendev.org/openstack/nova/commit/267a40663cd8d0b94bbc5ebda4ece55a45753b64
Submitter: "Zuul (22348)"
Branch: master
commit 267a40663cd8d0b94bbc5ebda4ece55a45753b64
Author: Kashyap Chamarthy <kchamart@xxxxxxxxxx>
Date: Thu Jan 28 16:35:10 2021 +0100
libvirt: Add a workaround to skip compareCPU() on destination
Nova's use of libvirt's compareCPU() API served its purpose
over the years, but its design limitations break live migration in
subtle ways. For example, the compareCPU() API compares against the
host physical CPUID. Some of the features from this CPUID are not
exposed by KVM, and then there are some features that KVM emulates that
are not in the host CPUID. The latter can cause bogus live migration
failures.
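
To illustrate why a comparison against the host's physical CPUID can fail spuriously, here is a toy sketch in Python. The feature names and sets are hypothetical and chosen only for illustration; they are not taken from the affected environment, and libvirt's real comparison works on CPU definitions in XML rather than plain sets.

```python
# Toy model: CPU features as plain sets (all values are hypothetical).
host_cpuid = {"sse4.2", "avx2", "mpx"}   # what the physical CPU reports
kvm_not_exposed = {"mpx"}                # in the host CPUID, but not exposed by KVM
kvm_emulated = {"x2apic"}                # emulated by KVM, absent from the host CPUID

# What a guest can actually be given on this host.
hypervisor_usable = (host_cpuid - kvm_not_exposed) | kvm_emulated

# A guest CPU that relies on a KVM-emulated feature.
guest_cpu = {"sse4.2", "avx2", "x2apic"}


def compatible(guest, available):
    """Return True if every guest feature is in the available set."""
    return guest <= available


# Comparing against the raw host CPUID rejects a guest that would run fine.
naive = compatible(guest_cpu, host_cpuid)
# Comparing against what the hypervisor can provide accepts it.
accurate = compatible(guest_cpu, hypervisor_usable)
```

In this toy model the naive check rejects the guest only because "x2apic" is missing from the host CPUID, which is the kind of bogus failure described above.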
With QEMU >= 2.9 and libvirt >= 4.4.0, libvirt will do the right thing in
terms of CPU compatibility checks on the destination host during live
migration. Nova satisfies these minimum version requirements by a good
margin. So, provide a workaround to skip the CPU comparison check on
the destination host before migrating a guest, and let libvirt handle it
correctly. This workaround will be removed once Nova replaces the older
libvirt APIs with their newer and improved counterparts[1][2].
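
A minimal sketch of the control flow such a workaround implies, in Python. This is not Nova's code: the function name, the skip_compare parameter, and the exception class are stand-ins invented for illustration, and the real option name is not given in this message.

```python
class InvalidCPUInfo(Exception):
    """Stand-in for nova.exception.InvalidCPUInfo (illustration only)."""


def check_dest_cpu(compare_cpu, source_cpu_info, skip_compare):
    """Run the destination-side CPU check unless the workaround is enabled.

    compare_cpu: callable returning a virCPUCompareResult-style integer.
    skip_compare: hypothetical boolean standing in for the workaround option.
    """
    if skip_compare:
        # Leave the compatibility check to libvirt during the migration itself.
        return "skipped"
    ret = compare_cpu(source_cpu_info)
    if ret <= 0:
        # 0 means incompatible, -1 means the comparison itself failed.
        raise InvalidCPUInfo("CPU doesn't have compatibility (ret=%d)" % ret)
    return "compatible"
```

With skip_compare set to True the comparison callable is never invoked, so a result like the "0" seen in this bug cannot block the migration on the Nova side.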
- - -
Note that Nova's libvirt driver calls compareCPU() in another method,
_check_cpu_compatibility(); I did not remove its usage yet, as it needs
more careful combing of the code, and then:
- where possible, remove the usage of compareCPU() altogether, and
rely on libvirt doing the right thing under the hood; or
- where Nova _must_ do the CPU comparison checks, switch to the better
libvirt CPU APIs -- baselineHypervisorCPU() and
compareHypervisorCPU() -- that are described here[1]. This is work
in progress[2].
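
For orientation, a hedged sketch of how the newer comparison API might be called from Python. It assumes the libvirt-python binding exposes compareHypervisorCPU(emulator, arch, machine, virttype, xmlCPU, flags) on a connection object, and that passing None for the first four arguments lets libvirt use its defaults. The helper name is hypothetical, and this is not the code from the work in progress[2].

```python
def compare_against_hypervisor(conn, cpu_xml):
    """Ask libvirt whether cpu_xml is usable with the host's hypervisor.

    conn is expected to behave like a libvirt.virConnect object.
    Returns True for an identical or superset result, False otherwise.
    """
    # Assumption: None for emulator, arch, machine and virttype selects defaults.
    ret = conn.compareHypervisorCPU(None, None, None, None, cpu_xml, 0)
    # virCPUCompareResult: -1 error, 0 incompatible, 1 identical, 2 superset.
    return ret > 0
```

Unlike compareCPU(), this call takes into account what the hypervisor can actually provide on the host, which is the distinction the commit message draws.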
[1] https://opendev.org/openstack/nova-specs/commit/70811da221035044e27
[2] https://review.opendev.org/q/topic:bp%252Fcpu-selection-with-hypervisor-consideration
Change-Id: I444991584118a969e9ea04d352821b07ec0ba88d
Closes-Bug: #1913716
Signed-off-by: Kashyap Chamarthy <kchamart@xxxxxxxxxx>
Signed-off-by: Balazs Gibizer <bgibizer@xxxxxxxxxx>
** Changed in: nova
Status: In Progress => Fix Released
https://bugs.launchpad.net/bugs/1913716
Title:
Live-migrating an instance from 'Queens' (CentOS-7) to 'Train'
(CentOS-8) fails during libvirt's compareCPU() check
Status in OpenStack Compute (nova):
Fix Released
Bug description:
[This bug was originally reported by Lukas Bezdicka when testing Red
Hat's OpenStack (OSP); but this should be reproducible in upstream
context as well. I'm writing this report based on the root cause
analysis in the environment where the bug occurred. Thanks to Daniel
Berrangé for the debugging help!]
Description
-----------
Live-migrating a guest from 'Queens' compute node (running CentOS 7) to
a 'Train' compute node (running CentOS 8) fails with:
-----------------------------------------------------------------------
[...]
_compare_cpu /usr/lib/python3.6/site-packages/nova/virt/libvirt/driver.py:8559
2021-01-26 23:30:25.169 7 ERROR nova.virt.libvirt.driver [req-774be110-7fb6-4865-a177-d624a821cf9e 19ec0130b8714aac8c64a5c2ee5b914b 352675f5f34d45d59bdd61fde58e4bd0 - default default] CPU doesn't have compatibility.
0
Refer to http://libvirt.org/html/libvirt-libvirt-host.html#virCPUCompareResult
2021-01-26 23:30:25.242 7 ERROR oslo_messaging.rpc.server [req-774be110-7fb6-4865-a177-d624a821cf9e 19ec0130b8714aac8c64a5c2ee5b914b 352675f5f34d45d59bdd61fde58e4bd0 - default default] Exception during message handling: nova.exception.InvalidCPUInfo: Unacceptable CPU info: CPU doesn't have compatibility.
[...]
2021-01-26 23:30:25.242 7 ERROR oslo_messaging.rpc.server block_migration, disk_over_commit)
2021-01-26 23:30:25.242 7 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/nova/virt/libvirt/driver.py", line 8258, in check_can_live_migrate_destination
2021-01-26 23:30:25.242 7 ERROR oslo_messaging.rpc.server self._compare_cpu(None, source_cpu_info, instance)
2021-01-26 23:30:25.242 7 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/nova/virt/libvirt/driver.py", line 8575, in _compare_cpu
2021-01-26 23:30:25.242 7 ERROR oslo_messaging.rpc.server raise exception.InvalidCPUInfo(reason=m % {'ret': ret, 'u': u})
2021-01-26 23:30:25.242 7 ERROR oslo_messaging.rpc.server nova.exception.InvalidCPUInfo: Unacceptable CPU info: CPU doesn't have compatibility.
2021-01-26 23:30:25.242 7 ERROR oslo_messaging.rpc.server
[...]
-----------------------------------------------------------------------
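
The bare "0" in the log above is the value returned by compareCPU(). A small Python sketch for decoding it, using the virCPUCompareResult values from the libvirt documentation referenced in the log; the helper name is hypothetical.

```python
# virCPUCompareResult values as documented by libvirt.
VIR_CPU_COMPARE_NAMES = {
    -1: "VIR_CPU_COMPARE_ERROR",
    0: "VIR_CPU_COMPARE_INCOMPATIBLE",
    1: "VIR_CPU_COMPARE_IDENTICAL",
    2: "VIR_CPU_COMPARE_SUPERSET",
}


def describe_compare_result(ret):
    """Return the symbolic name for a compareCPU() return value."""
    return VIR_CPU_COMPARE_NAMES.get(ret, "unknown result %r" % (ret,))


# The value logged in this bug report.
logged = describe_compare_result(0)
```

Here the logged value decodes to VIR_CPU_COMPARE_INCOMPATIBLE, which matches the InvalidCPUInfo exception in the traceback.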
Environment
-----------
The bug was reported by testing in a nested KVM environment, running on
Intel hardware (Xeon(R) Gold 5218R CPU @ 2.10GHz), with the entire
OpenStack setup in VMs. So the Nova instances themselves will be nested
guests.
- Source: a CentOS-7 compute node (a level-1 guest), running OpenStack
'Queens'
- Destination: a CentOS-8 compute node (a level-1 guest), running
OpenStack 'Train'
Steps to reproduce
------------------
Live-migrate a guest from the source host to the destination host.
Expected result
---------------
Live migration should've succeeded.
Actual result
-------------
Live migration fails during compareCPU() check on the destination host
with:
[...]
_compare_cpu /usr/lib/python3.6/site-packages/nova/virt/libvirt/driver.py:8559
2021-01-26 23:30:25.169 7 ERROR nova.virt.libvirt.driver [req-774be110-7fb6-4865-a177-d624a821cf9e 19ec0130b8714aac8c64a5c2ee5b914b 352675f5f34d45d59bdd61fde58e4bd0 - default default] CPU doesn't have compatibility.
[...]