[Bug 1982497] [NEW] CPUUnpinningUnknown exception thrown after failed Live Migration for instance with dedicated CPUs
Public bug reported:
An instance with dedicated CPUs cannot be deleted after a failed live
migration. The delete fails with nova.exception.CPUUnpinningInvalid:
CPU set to unpin [2, 3] must be a subset of pinned CPU set [0, 1]
Steps to reproduce
------------------
1) create a multinode devstack with cpu_dedicated_set configured asymmetrically: host_a 0,1 and host_b 2,3
2) boot an instance on host_a with two dedicated CPUs. It will occupy CPUs 0,1
3) break live migration, i.e. prevent host_a from communicating with host_b
4) live migrate the instance. Nova will claim CPUs 2,3 on host_b
5) observe that the live migration failed and was rolled back. The instance is still running on host_a
6) try to delete the instance. The delete fails because nova tries to unpin CPUs 2,3 instead of CPUs 0,1 on host_a
2022-07-21 15:35:32,229 ERROR [nova.compute.manager] Setting instance vm_state to ERROR
Traceback (most recent call last):
  File "/build-bionic/nova/compute/manager.py", line 3060, in do_terminate_instance
    self._delete_instance(context, instance, bdms)
  File "/build-bionic/nova/compute/manager.py", line 3024, in _delete_instance
    self._complete_deletion(context, instance)
  File "/build-bionic/nova/compute/manager.py", line 828, in _complete_deletion
    self._update_resource_tracker(context, instance)
  File "/build-bionic/nova/compute/manager.py", line 596, in _update_resource_tracker
    self.rt.update_usage(context, instance, instance.node)
  File "/build-bionic/.tox/functional-py38/lib/python3.8/site-packages/oslo_concurrency/lockutils.py", line 360, in inner
    return f(*args, **kwargs)
  File "/build-bionic/nova/compute/resource_tracker.py", line 656, in update_usage
    self._update_usage_from_instance(context, instance, nodename)
  File "/build-bionic/nova/compute/resource_tracker.py", line 1491, in _update_usage_from_instance
    self._update_usage(self._get_usage_dict(instance, instance),
  File "/build-bionic/nova/compute/resource_tracker.py", line 1295, in _update_usage
    cn.numa_topology = hardware.numa_usage_from_instance_numa(
  File "/build-bionic/nova/virt/hardware.py", line 2374, in numa_usage_from_instance_numa
    new_cell.unpin_cpus(pinned_cpus)
  File "/build-bionic/nova/objects/numa.py", line 106, in unpin_cpus
    raise exception.CPUUnpinningInvalid(requested=list(cpus),
nova.exception.CPUUnpinningInvalid: CPU set to unpin [2, 3] must be a subset of pinned CPU set [0, 1]
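
The sequence in the steps above, and the subset check at the bottom of the
traceback, can be illustrated with a toy model. This is only a sketch and not
nova code: ToyHost, reproduce() and the local CPUUnpinningInvalid class are
hypothetical names made up for illustration. It also assumes that the
instance's pinning record is left pointing at the host_b CPUs after the
rollback; the report only states that nova tries to unpin 2,3 on host_a, so
how the record gets into that state is a guess.

```python
class CPUUnpinningInvalid(Exception):
    """Toy stand-in for the nova exception of the same name."""


class ToyHost:
    """Hypothetical, simplified model of a host's pinned-CPU accounting."""

    def __init__(self, name, dedicated_cpus):
        self.name = name
        self.dedicated_cpus = set(dedicated_cpus)
        self.pinned_cpus = set()

    def pin(self, cpus):
        cpus = set(cpus)
        free = self.dedicated_cpus - self.pinned_cpus
        if not cpus <= free:
            raise ValueError(f"cannot pin {sorted(cpus)} on {self.name}")
        self.pinned_cpus |= cpus

    def unpin(self, cpus):
        cpus = set(cpus)
        # Only CPUs that are currently pinned on this host may be unpinned.
        if not cpus <= self.pinned_cpus:
            raise CPUUnpinningInvalid(
                f"CPU set to unpin {sorted(cpus)} must be a subset of "
                f"pinned CPU set {sorted(self.pinned_cpus)}")
        self.pinned_cpus -= cpus


def reproduce():
    """Walk through steps 1-6 and return the error message, if any."""
    # Step 1: asymmetric dedicated CPU sets.
    host_a = ToyHost("host_a", {0, 1})
    host_b = ToyHost("host_b", {2, 3})

    # Step 2: the instance boots on host_a and occupies CPUs 0,1.
    instance_cpus = {0, 1}
    host_a.pin(instance_cpus)

    # Step 4: the migration claims CPUs 2,3 on host_b.
    # ASSUMPTION: the instance's pinning record is updated to the
    # destination CPUs at this point.
    host_b.pin({2, 3})
    instance_cpus = {2, 3}

    # Step 5: the migration fails and the claim on host_b is released.
    # ASSUMPTION: the instance's pinning record is not restored to 0,1.
    host_b.unpin({2, 3})

    # Step 6: the delete on host_a unpins what the instance record says.
    try:
        host_a.unpin(instance_cpus)
    except CPUUnpinningInvalid as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    print(reproduce())
```

In this model the delete would succeed if the instance record were reset to
0,1 during rollback, which is consistent with the behaviour described in
step 6.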
** Affects: nova
Importance: Undecided
Status: New
https://bugs.launchpad.net/bugs/1982497