yahoo-eng-team team mailing list archive

[Bug 1982497] [NEW] CPUUnpinningUnknown exception thrown after failed Live Migration for instance with dedicated CPUs

 

Public bug reported:

An instance with dedicated CPUs cannot be deleted after a failed live
migration: the delete fails with nova.exception.CPUUnpinningInvalid:
CPU set to unpin [2, 3] must be a subset of pinned CPU set [0, 1]

Steps to reproduce
------------------
1) Create a multinode devstack with cpu_dedicated_set configured asymmetrically: host_a 0,1 and host_b 2,3.
2) Boot an instance on host_a with two dedicated CPUs. It will occupy CPUs 0,1.
3) Break live migration, i.e. prevent host_a from communicating with host_b.
4) Live migrate the instance. Nova will claim CPUs 2,3 on host_b.
5) Observe that the live migration failed and rolled back. The instance is still running on host_a.
6) Try to delete the instance. It will fail as nova tries to unpin CPUs 2,3 instead of CPUs 0,1 on host_a.
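
The steps above can be sketched as a toy model in plain Python. This is not nova code: ToyHost, its pin/unpin methods, and reproduce() are hypothetical names made up for illustration. The model also assumes that the instance's pinning record is left pointing at the destination CPUs after the rollback, which is only one possible explanation of the symptom and is not confirmed by this report.

```python
class CPUUnpinningInvalid(Exception):
    """Toy stand-in for nova.exception.CPUUnpinningInvalid."""


class ToyHost:
    """Hypothetical model of one compute host's dedicated CPU bookkeeping."""

    def __init__(self, name, dedicated_cpus):
        self.name = name
        self.dedicated_cpus = set(dedicated_cpus)
        self.pinned_cpus = set()

    def pin(self, cpus):
        cpus = set(cpus)
        # Only free dedicated CPUs may be pinned.
        if not cpus <= self.dedicated_cpus - self.pinned_cpus:
            raise ValueError(f"{sorted(cpus)} not free on {self.name}")
        self.pinned_cpus |= cpus

    def unpin(self, cpus):
        cpus = set(cpus)
        # Mirrors the subset rule named in the error message.
        if not cpus <= self.pinned_cpus:
            raise CPUUnpinningInvalid(
                f"CPU set to unpin {sorted(cpus)} must be a subset of "
                f"pinned CPU set {sorted(self.pinned_cpus)}"
            )
        self.pinned_cpus -= cpus


def reproduce():
    """Walk through steps 1-6 and return the error text, if any."""
    # Step 1: asymmetric dedicated CPU sets.
    host_a = ToyHost("host_a", {0, 1})
    host_b = ToyHost("host_b", {2, 3})

    # Step 2: the instance is pinned to CPUs 0,1 on host_a.
    instance_pins = {0, 1}
    host_a.pin(instance_pins)

    # Step 4: the migration claims CPUs 2,3 on host_b.
    # ASSUMPTION: the instance record now carries the destination pinning.
    host_b.pin({2, 3})
    instance_pins = {2, 3}

    # Step 5: rollback releases the claim on host_b, but (assumed) the
    # instance record is not restored to the source pinning.
    host_b.unpin({2, 3})

    # Step 6: delete tries to unpin the recorded CPUs on host_a.
    try:
        host_a.unpin(instance_pins)
    except CPUUnpinningInvalid as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    print(reproduce())
```

In this model the delete only succeeds if instance_pins is reset to {0, 1} as part of the rollback in step 5.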

2022-07-21 15:35:32,229 ERROR [nova.compute.manager] Setting instance vm_state to ERROR
Traceback (most recent call last):
  File "/build-bionic/nova/compute/manager.py", line 3060, in do_terminate_instance
    self._delete_instance(context, instance, bdms)
  File "/build-bionic/nova/compute/manager.py", line 3024, in _delete_instance
    self._complete_deletion(context, instance)
  File "/build-bionic/nova/compute/manager.py", line 828, in _complete_deletion
    self._update_resource_tracker(context, instance)
  File "/build-bionic/nova/compute/manager.py", line 596, in _update_resource_tracker
    self.rt.update_usage(context, instance, instance.node)
  File "/build-bionic/.tox/functional-py38/lib/python3.8/site-packages/oslo_concurrency/lockutils.py", line 360, in inner
    return f(*args, **kwargs)
  File "/build-bionic/nova/compute/resource_tracker.py", line 656, in update_usage
    self._update_usage_from_instance(context, instance, nodename)
  File "/build-bionic/nova/compute/resource_tracker.py", line 1491, in _update_usage_from_instance
    self._update_usage(self._get_usage_dict(instance, instance),
  File "/build-bionic/nova/compute/resource_tracker.py", line 1295, in _update_usage
    cn.numa_topology = hardware.numa_usage_from_instance_numa(
  File "/build-bionic/nova/virt/hardware.py", line 2374, in numa_usage_from_instance_numa
    new_cell.unpin_cpus(pinned_cpus)
  File "/build-bionic/nova/objects/numa.py", line 106, in unpin_cpus
    raise exception.CPUUnpinningInvalid(requested=list(cpus),
nova.exception.CPUUnpinningInvalid: CPU set to unpin [2, 3] must be a subset of pinned CPU set [0, 1]
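
For triage, the two CPU sets in the final line of the traceback can be extracted and compared. The helper below is a minimal sketch: parse_cpu_sets and MESSAGE_RE are hypothetical names, and the pattern assumes the message keeps exactly the wording shown in this report.

```python
import re

# Matches the message wording seen in the traceback above (an assumption:
# other nova versions may phrase it differently).
MESSAGE_RE = re.compile(
    r"CPU set to unpin \[(?P<requested>[\d, ]*)\] must be a subset of "
    r"pinned CPU set \[(?P<pinned>[\d, ]*)\]"
)


def parse_cpu_sets(line):
    """Return (requested, pinned) as sets of ints, or None if no match."""
    match = MESSAGE_RE.search(line)
    if match is None:
        return None

    def to_set(text):
        # Split "2, 3" into {2, 3}; an empty list yields an empty set.
        return {int(item) for item in text.split(",") if item.strip()}

    return to_set(match["requested"]), to_set(match["pinned"])


if __name__ == "__main__":
    line = (
        "nova.exception.CPUUnpinningInvalid: CPU set to unpin [2, 3] "
        "must be a subset of pinned CPU set [0, 1]"
    )
    requested, pinned = parse_cpu_sets(line)
    # CPUs the delete tried to release that host_a never had pinned.
    print(sorted(requested - pinned))
```

For the line in this report, every requested CPU falls outside the pinned set, which matches the description that the destination's CPUs are being unpinned on the source host.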

** Affects: nova
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1982497

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1982497/+subscriptions