[Bug 2025480] [NEW] overlapping pinned CPUs after unshelve


Public bug reported:

It seems that after unshelve, occasionally the request for a dedicated
CPU is ignored. More precisely, the first pinned CPU does not seem to be
marked as consumed, so the second instance may end up pinned to the same
pCPU. This was first observed on victoria (6 times out of 46 tries), but
I was then able to reproduce it on master too (6 times out of 20 tries).
The attached logs are from the victoria environment, a single-host all-
in-one devstack running only the VMs used for this reproduction.
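The suspected failure mode can be illustrated with a toy model (this is NOT nova code, just a sketch of the hypothesis): if the first claim's pinned CPU is never recorded as consumed in the host cell, the second claim sees the same free list and picks the same pCPU.

```python
# Toy model (NOT nova code) of the suspected race: the first unshelve's
# pinned CPU is not recorded as consumed, so the second pick overlaps.

def pick_pinned_cpu(free_cpus):
    """Pick the lowest-numbered free pCPU, a simple pack-onto-cores policy."""
    return min(free_cpus)

host_free = {0, 1}                   # cpu_dedicated_set = 0,1

first = pick_pinned_cpu(host_free)   # -> 0
# Correct behaviour would now do: host_free.discard(first)
# Observed behaviour: the consumption is never recorded, so the next
# claim sees the same free set and overlaps.
second = pick_pinned_cpu(host_free)  # -> 0 again: overlapping pins

print(first, second)  # 0 0
```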

stable/victoria
devstack 3eb6e2d7
nova 1aca09b966

master
devstack b10c0602
nova 2aea80c0af

config:
[[post-config|/etc/nova/nova.conf]]
[DEFAULT]
scheduler_default_filters = NUMATopologyFilter, ...
[compute]
cpu_dedicated_set = 0,1

Confirming this config in placement:
$ openstack --os-placement-api-version 1.17 resource provider inventory show 46b3d4de-bb45-4607-8860-040eb2dcd0d7 PCPU
+------------------+-------+
| Field            | Value |
+------------------+-------+
| allocation_ratio | 1.0   |
| min_unit         | 1     |
| max_unit         | 2     |
| reserved         | 0     |
| step_size        | 1     |
| total            | 2     |
+------------------+-------+

Reproduction steps:
openstack flavor create cirros256-pinned --public --vcpus 1 --ram 256 --disk 1 --property hw_rng:allowed=True --property hw:cpu_policy=dedicated

openstack server list -f value -c ID | xargs -r openstack server delete --wait

openstack server create --flavor cirros256-pinned --image cirros-0.5.1-x86_64-disk --nic net-id=private vm0 --wait
openstack server shelve vm0
sleep 10 # make sure shelve finished
openstack server create --flavor cirros256-pinned --image cirros-0.5.1-x86_64-disk --nic net-id=private vm1 --wait
openstack server shelve vm1
sleep 10

openstack server unshelve vm0 ; sleep 15 ; openstack server unshelve vm1 # the amount of sleep could easily be relevant
watch openstack server list # wait until both go ACTIVE

# both vms ended up on the same cpu
$ for vm in $( sudo virsh list --name ) ; do sudo virsh dumpxml $vm | xmlstarlet sel -t -v '//vcpupin/@cpuset' ; echo ; done
0
0
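The overlap check above can also be done programmatically. A sketch (the XML snippets are trimmed stand-ins for `virsh dumpxml` output, not full domain XML):

```python
# Sketch: detect overlapping <vcpupin cpuset="..."> values across the
# domain XMLs of all running guests. The inline XML below is a trimmed
# stand-in for real `virsh dumpxml` output.
import xml.etree.ElementTree as ET
from collections import Counter

def pinned_cpusets(domain_xml):
    """Return the cpuset attribute of every <vcpupin> element."""
    root = ET.fromstring(domain_xml)
    return [pin.get('cpuset') for pin in root.iter('vcpupin')]

vm_xmls = [
    "<domain><cputune><vcpupin vcpu='0' cpuset='0'/></cputune></domain>",
    "<domain><cputune><vcpupin vcpu='0' cpuset='0'/></cputune></domain>",
]

counts = Counter(c for xml in vm_xmls for c in pinned_cpusets(xml))
overlaps = {cpu: n for cpu, n in counts.items() if n > 1}
print(overlaps)  # {'0': 2} -> both VMs pinned to pCPU 0
```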

Data collected from the environment where the above reproduction
triggered the bug:

$ openstack server list
+--------------------------------------+------+--------+--------------------------------------------------------+--------------------------+------------------+
| ID                                   | Name | Status | Networks                                               | Image                    | Flavor           |
+--------------------------------------+------+--------+--------------------------------------------------------+--------------------------+------------------+
| 4734b8a5-a6dd-432a-86c9-ba0367bb86cc | vm1  | ACTIVE | private=10.0.0.27, fdfb:ab27:b2b2:0:f816:3eff:fe80:2fd | cirros-0.5.1-x86_64-disk | cirros256-pinned |
| e30de509-6988-4535-a6f5-520c52fba087 | vm0  | ACTIVE | private=10.0.0.6, fdfb:ab27:b2b2:0:f816:3eff:fe78:d368 | cirros-0.5.1-x86_64-disk | cirros256-pinned |
+--------------------------------------+------+--------+--------------------------------------------------------+--------------------------+------------------+

$ openstack server show vm0
+-------------------------------------+-----------------------------------------------------------------+
| Field                               | Value                                                           |
+-------------------------------------+-----------------------------------------------------------------+
| OS-DCF:diskConfig                   | MANUAL                                                          |
| OS-EXT-AZ:availability_zone         | nova                                                            |
| OS-EXT-SRV-ATTR:host                | devstack1v                                                      |
| OS-EXT-SRV-ATTR:hypervisor_hostname | devstack1v                                                      |
| OS-EXT-SRV-ATTR:instance_name       | instance-0000001f                                               |
| OS-EXT-STS:power_state              | Running                                                         |
| OS-EXT-STS:task_state               | None                                                            |
| OS-EXT-STS:vm_state                 | active                                                          |
| OS-SRV-USG:launched_at              | 2023-06-29T10:45:25.000000                                      |
| OS-SRV-USG:terminated_at            | None                                                            |
| accessIPv4                          |                                                                 |
| accessIPv6                          |                                                                 |
| addresses                           | private=10.0.0.6, fdfb:ab27:b2b2:0:f816:3eff:fe78:d368          |
| config_drive                        |                                                                 |
| created                             | 2023-06-29T10:44:42Z                                            |
| flavor                              | cirros256-pinned (a61fac7f-9d3a-4788-8dd8-b84534ed5ba6)         |
| hostId                              | a660a4498447edab1528a7c92f7eaccd7aad2c258c896f7820aef9f6        |
| id                                  | e30de509-6988-4535-a6f5-520c52fba087                            |
| image                               | cirros-0.5.1-x86_64-disk (aaefde72-6da8-45e3-bc58-d7d1b9333d0e) |
| key_name                            | None                                                            |
| name                                | vm0                                                             |
| progress                            | 0                                                               |
| project_id                          | 0ad9573f6c57482bbd8202ce43706ca6                                |
| properties                          |                                                                 |
| security_groups                     | name='default'                                                  |
| status                              | ACTIVE                                                          |
| updated                             | 2023-06-29T10:45:26Z                                            |
| user_id                             | 971105d66de542588ddccdef15594fab                                |
| volumes_attached                    |                                                                 |
+-------------------------------------+-----------------------------------------------------------------+

$ openstack server show vm1
+-------------------------------------+-----------------------------------------------------------------+
| Field                               | Value                                                           |
+-------------------------------------+-----------------------------------------------------------------+
| OS-DCF:diskConfig                   | MANUAL                                                          |
| OS-EXT-AZ:availability_zone         | nova                                                            |
| OS-EXT-SRV-ATTR:host                | devstack1v                                                      |
| OS-EXT-SRV-ATTR:hypervisor_hostname | devstack1v                                                      |
| OS-EXT-SRV-ATTR:instance_name       | instance-00000020                                               |
| OS-EXT-STS:power_state              | Running                                                         |
| OS-EXT-STS:task_state               | None                                                            |
| OS-EXT-STS:vm_state                 | active                                                          |
| OS-SRV-USG:launched_at              | 2023-06-29T10:45:45.000000                                      |
| OS-SRV-USG:terminated_at            | None                                                            |
| accessIPv4                          |                                                                 |
| accessIPv6                          |                                                                 |
| addresses                           | private=10.0.0.27, fdfb:ab27:b2b2:0:f816:3eff:fe80:2fd          |
| config_drive                        |                                                                 |
| created                             | 2023-06-29T10:45:00Z                                            |
| flavor                              | cirros256-pinned (a61fac7f-9d3a-4788-8dd8-b84534ed5ba6)         |
| hostId                              | a660a4498447edab1528a7c92f7eaccd7aad2c258c896f7820aef9f6        |
| id                                  | 4734b8a5-a6dd-432a-86c9-ba0367bb86cc                            |
| image                               | cirros-0.5.1-x86_64-disk (aaefde72-6da8-45e3-bc58-d7d1b9333d0e) |
| key_name                            | None                                                            |
| name                                | vm1                                                             |
| progress                            | 0                                                               |
| project_id                          | 0ad9573f6c57482bbd8202ce43706ca6                                |
| properties                          |                                                                 |
| security_groups                     | name='default'                                                  |
| status                              | ACTIVE                                                          |
| updated                             | 2023-06-29T10:45:45Z                                            |
| user_id                             | 971105d66de542588ddccdef15594fab                                |
| volumes_attached                    |                                                                 |
+-------------------------------------+-----------------------------------------------------------------+

$ openstack flavor show cirros256-pinned
+----------------------------+--------------------------------------------------+
| Field                      | Value                                            |
+----------------------------+--------------------------------------------------+
| OS-FLV-DISABLED:disabled   | False                                            |
| OS-FLV-EXT-DATA:ephemeral  | 0                                                |
| access_project_ids         | None                                             |
| disk                       | 1                                                |
| id                         | a61fac7f-9d3a-4788-8dd8-b84534ed5ba6             |
| name                       | cirros256-pinned                                 |
| os-flavor-access:is_public | True                                             |
| properties                 | hw:cpu_policy='dedicated', hw_rng:allowed='True' |
| ram                        | 256                                              |
| rxtx_factor                | 1.0                                              |
| swap                       |                                                  |
| vcpus                      | 1                                                |
+----------------------------+--------------------------------------------------+

$ openstack server event list vm0
+------------------------------------------+--------------------------------------+----------+----------------------------+
| Request ID                               | Server ID                            | Action   | Start Time                 |
+------------------------------------------+--------------------------------------+----------+----------------------------+
| req-e09149a8-6ef9-49c3-a757-57bbc14d6004 | e30de509-6988-4535-a6f5-520c52fba087 | unshelve | 2023-06-29T10:45:19.000000 |
| req-4628f975-a57b-468e-a41c-0ef405df4bae | e30de509-6988-4535-a6f5-520c52fba087 | shelve   | 2023-06-29T10:44:49.000000 |
| req-d7595149-3cc6-472c-9da8-7a40c020a620 | e30de509-6988-4535-a6f5-520c52fba087 | create   | 2023-06-29T10:44:42.000000 |
+------------------------------------------+--------------------------------------+----------+----------------------------+
$ openstack server event show vm0 req-e09149a8-6ef9-49c3-a757-57bbc14d6004
+---------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Field         | Value                                                                                                                                                                                                                      |
+---------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| action        | unshelve                                                                                                                                                                                                                   |
| events        | [{'event': 'compute_unshelve_instance', 'start_time': '2023-06-29T10:45:19.000000', 'finish_time': '2023-06-29T10:45:26.000000', 'result': 'Success', 'traceback': None}, {'event': 'schedule_instances', 'start_time':    |
|               | '2023-06-29T10:45:19.000000', 'finish_time': '2023-06-29T10:45:19.000000', 'result': 'Success', 'traceback': None}, {'event': 'get_image_info', 'start_time': '2023-06-29T10:45:19.000000', 'finish_time':                 |
|               | '2023-06-29T10:45:19.000000', 'result': 'Success', 'traceback': None}]                                                                                                                                                     |
| instance_uuid | e30de509-6988-4535-a6f5-520c52fba087                                                                                                                                                                                       |
| message       | None                                                                                                                                                                                                                       |
| project_id    | 0ad9573f6c57482bbd8202ce43706ca6                                                                                                                                                                                           |
| request_id    | req-e09149a8-6ef9-49c3-a757-57bbc14d6004                                                                                                                                                                                   |
| start_time    | 2023-06-29T10:45:19.000000                                                                                                                                                                                                 |
| user_id       | 971105d66de542588ddccdef15594fab                                                                                                                                                                                           |
+---------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
$ openstack server event show vm0 req-4628f975-a57b-468e-a41c-0ef405df4bae
+---------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Field         | Value                                                                                                                                                                   |
+---------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| action        | shelve                                                                                                                                                                  |
| events        | [{'event': 'compute_shelve_instance', 'start_time': '2023-06-29T10:44:49.000000', 'finish_time': '2023-06-29T10:44:55.000000', 'result': 'Success', 'traceback': None}] |
| instance_uuid | e30de509-6988-4535-a6f5-520c52fba087                                                                                                                                    |
| message       | None                                                                                                                                                                    |
| project_id    | 0ad9573f6c57482bbd8202ce43706ca6                                                                                                                                        |
| request_id    | req-4628f975-a57b-468e-a41c-0ef405df4bae                                                                                                                                |
| start_time    | 2023-06-29T10:44:49.000000                                                                                                                                              |
| user_id       | 971105d66de542588ddccdef15594fab                                                                                                                                        |
+---------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

$ openstack server event list vm1
+------------------------------------------+--------------------------------------+----------+----------------------------+
| Request ID                               | Server ID                            | Action   | Start Time                 |
+------------------------------------------+--------------------------------------+----------+----------------------------+
| req-9ab0eabc-bf97-48bf-b534-dd22df3c6452 | 4734b8a5-a6dd-432a-86c9-ba0367bb86cc | unshelve | 2023-06-29T10:45:35.000000 |
| req-1dc9b980-d765-415a-b946-2daf53f361b1 | 4734b8a5-a6dd-432a-86c9-ba0367bb86cc | shelve   | 2023-06-29T10:45:07.000000 |
| req-5071fd9f-00e2-44f8-9b6c-8715fe918e98 | 4734b8a5-a6dd-432a-86c9-ba0367bb86cc | create   | 2023-06-29T10:45:00.000000 |
+------------------------------------------+--------------------------------------+----------+----------------------------+
$ openstack server event show vm1 req-9ab0eabc-bf97-48bf-b534-dd22df3c6452
+---------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Field         | Value                                                                                                                                                                                                                      |
+---------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| action        | unshelve                                                                                                                                                                                                                   |
| events        | [{'event': 'compute_unshelve_instance', 'start_time': '2023-06-29T10:45:36.000000', 'finish_time': '2023-06-29T10:45:45.000000', 'result': 'Success', 'traceback': None}, {'event': 'schedule_instances', 'start_time':    |
|               | '2023-06-29T10:45:35.000000', 'finish_time': '2023-06-29T10:45:36.000000', 'result': 'Success', 'traceback': None}, {'event': 'get_image_info', 'start_time': '2023-06-29T10:45:35.000000', 'finish_time':                 |
|               | '2023-06-29T10:45:35.000000', 'result': 'Success', 'traceback': None}]                                                                                                                                                     |
| instance_uuid | 4734b8a5-a6dd-432a-86c9-ba0367bb86cc                                                                                                                                                                                       |
| message       | None                                                                                                                                                                                                                       |
| project_id    | 0ad9573f6c57482bbd8202ce43706ca6                                                                                                                                                                                           |
| request_id    | req-9ab0eabc-bf97-48bf-b534-dd22df3c6452                                                                                                                                                                                   |
| start_time    | 2023-06-29T10:45:35.000000                                                                                                                                                                                                 |
| user_id       | 971105d66de542588ddccdef15594fab                                                                                                                                                                                           |
+---------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
$ openstack server event show vm1 req-1dc9b980-d765-415a-b946-2daf53f361b1
+---------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Field         | Value                                                                                                                                                                   |
+---------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| action        | shelve                                                                                                                                                                  |
| events        | [{'event': 'compute_shelve_instance', 'start_time': '2023-06-29T10:45:08.000000', 'finish_time': '2023-06-29T10:45:13.000000', 'result': 'Success', 'traceback': None}] |
| instance_uuid | 4734b8a5-a6dd-432a-86c9-ba0367bb86cc                                                                                                                                    |
| message       | None                                                                                                                                                                    |
| project_id    | 0ad9573f6c57482bbd8202ce43706ca6                                                                                                                                        |
| request_id    | req-1dc9b980-d765-415a-b946-2daf53f361b1                                                                                                                                |
| start_time    | 2023-06-29T10:45:07.000000                                                                                                                                              |
| user_id       | 971105d66de542588ddccdef15594fab                                                                                                                                        |
+---------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

Logs that seemed relevant to me:
vm0 unshelve:
jún 29 10:45:19 devstack1v nova-compute[379150]: DEBUG nova.virt.hardware [None req-e09149a8-6ef9-49c3-a757-57bbc14d6004 admin admin] Attempting to fit instance cell InstanceNUMACell(cpu_pinning_raw={0=0},cpu_policy='dedicated',cpu_thread_policy=None,cpu_topology=VirtCPUTopology,cpuset=set([]),cpuset_reserved=None,id=0,memory=256,pagesize=None,pcpuset=set([0])) on host_cell NUMACell(cpu_usage=0,cpuset=set([]),id=0,memory=7957,memory_usage=0,mempages=[NUMAPagesTopology,NUMAPagesTopology],network_metadata=NetworkMetadata,pcpuset=set([0,1]),pinned_cpus=set([]),siblings=[set([0]),set([1])]) {{(pid=379150) _numa_fit_instance_cell /opt/stack/nova/nova/virt/hardware.py:1046}}
jún 29 10:45:19 devstack1v nova-compute[379150]: DEBUG nova.virt.hardware [None req-e09149a8-6ef9-49c3-a757-57bbc14d6004 admin admin] Pinning has been requested {{(pid=379150) _numa_fit_instance_cell /opt/stack/nova/nova/virt/hardware.py:1133}}
jún 29 10:45:19 devstack1v nova-compute[379150]: DEBUG nova.virt.hardware [None req-e09149a8-6ef9-49c3-a757-57bbc14d6004 admin admin] Packing an instance onto a set of siblings:     host_cell_free_siblings: [{0}, {1}]    instance_cell: InstanceNUMACell(cpu_pinning_raw={0=0},cpu_policy='dedicated',cpu_thread_policy=None,cpu_topology=VirtCPUTopology,cpuset=set([]),cpuset_reserved=None,id=0,memory=256,pagesize=None,pcpuset=set([0]))    host_cell_id: 0    threads_per_core: 1    num_cpu_reserved: 0 {{(pid=379150) _pack_instance_onto_cores /opt/stack/nova/nova/virt/hardware.py:731}}
jún 29 10:45:19 devstack1v nova-compute[379150]: DEBUG nova.virt.hardware [None req-e09149a8-6ef9-49c3-a757-57bbc14d6004 admin admin] Built sibling_sets: defaultdict(<class 'list'>, {1: [{0}, {1}]}) {{(pid=379150) _pack_instance_onto_cores /opt/stack/nova/nova/virt/hardware.py:752}}
jún 29 10:45:19 devstack1v nova-compute[379150]: INFO nova.virt.hardware [None req-e09149a8-6ef9-49c3-a757-57bbc14d6004 admin admin] Computed NUMA topology CPU pinning: usable pCPUs: [[0], [1]], vCPUs mapping: [(0, 0)]
jún 29 10:45:19 devstack1v nova-compute[379150]: DEBUG nova.virt.hardware [None req-e09149a8-6ef9-49c3-a757-57bbc14d6004 admin admin] Selected cores for pinning: [(0, 0)], in cell 0 {{(pid=379150) _pack_instance_onto_cores /opt/stack/nova/nova/virt/hardware.py:1013}}
jún 29 10:45:19 devstack1v nova-compute[379150]: INFO nova.compute.claims [None req-e09149a8-6ef9-49c3-a757-57bbc14d6004 admin admin] [instance: e30de509-6988-4535-a6f5-520c52fba087] Claim successful on node devstack1v

vm1 unshelve:
jún 29 10:45:37 devstack1v nova-compute[379150]: DEBUG nova.virt.hardware [None req-9ab0eabc-bf97-48bf-b534-dd22df3c6452 admin admin] Attempting to fit instance cell InstanceNUMACell(cpu_pinning_raw={0=0},cpu_policy='dedicated',cpu_thread_policy=None,cpu_topology=VirtCPUTopology,cpuset=set([]),cpuset_reserved=None,id=0,memory=256,pagesize=None,pcpuset=set([0])) on host_cell NUMACell(cpu_usage=0,cpuset=set([]),id=0,memory=7957,memory_usage=0,mempages=[NUMAPagesTopology,NUMAPagesTopology],network_metadata=NetworkMetadata,pcpuset=set([0,1]),pinned_cpus=set([]),siblings=[set([0]),set([1])]) {{(pid=379150) _numa_fit_instance_cell /opt/stack/nova/nova/virt/hardware.py:1046}}
jún 29 10:45:37 devstack1v nova-compute[379150]: DEBUG nova.virt.hardware [None req-9ab0eabc-bf97-48bf-b534-dd22df3c6452 admin admin] Pinning has been requested {{(pid=379150) _numa_fit_instance_cell /opt/stack/nova/nova/virt/hardware.py:1133}}
jún 29 10:45:37 devstack1v nova-compute[379150]: DEBUG nova.virt.hardware [None req-9ab0eabc-bf97-48bf-b534-dd22df3c6452 admin admin] Packing an instance onto a set of siblings:     host_cell_free_siblings: [{0}, {1}]    instance_cell: InstanceNUMACell(cpu_pinning_raw={0=0},cpu_policy='dedicated',cpu_thread_policy=None,cpu_topology=VirtCPUTopology,cpuset=set([]),cpuset_reserved=None,id=0,memory=256,pagesize=None,pcpuset=set([0]))    host_cell_id: 0    threads_per_core: 1    num_cpu_reserved: 0 {{(pid=379150) _pack_instance_onto_cores /opt/stack/nova/nova/virt/hardware.py:731}}
jún 29 10:45:37 devstack1v nova-compute[379150]: DEBUG nova.virt.hardware [None req-9ab0eabc-bf97-48bf-b534-dd22df3c6452 admin admin] Built sibling_sets: defaultdict(<class 'list'>, {1: [{0}, {1}]}) {{(pid=379150) _pack_instance_onto_cores /opt/stack/nova/nova/virt/hardware.py:752}}
jún 29 10:45:37 devstack1v nova-compute[379150]: INFO nova.virt.hardware [None req-9ab0eabc-bf97-48bf-b534-dd22df3c6452 admin admin] Computed NUMA topology CPU pinning: usable pCPUs: [[0], [1]], vCPUs mapping: [(0, 0)]
jún 29 10:45:37 devstack1v nova-compute[379150]: DEBUG nova.virt.hardware [None req-9ab0eabc-bf97-48bf-b534-dd22df3c6452 admin admin] Selected cores for pinning: [(0, 0)], in cell 0 {{(pid=379150) _pack_instance_onto_cores /opt/stack/nova/nova/virt/hardware.py:1013}}
jún 29 10:45:37 devstack1v nova-compute[379150]: INFO nova.compute.claims [None req-9ab0eabc-bf97-48bf-b534-dd22df3c6452 admin admin] [instance: 4734b8a5-a6dd-432a-86c9-ba0367bb86cc] Claim successful on node devstack1v

The first error in the logs appeared not during unshelve but well after it:
jún 29 10:47:27 devstack1v nova-compute[379150]: ERROR nova.compute.manager [None req-a52cbcd3-ad79-457e-a0e1-3ec225226411 None None] Error updating resources for node devstack1v.: nova.exception.CPUPinningInvalid: CPU set to pin [0] must be a subset of free CPU set [1]
jún 29 10:47:27 devstack1v nova-compute[379150]: ERROR nova.compute.manager Traceback (most recent call last):
jún 29 10:47:27 devstack1v nova-compute[379150]: ERROR nova.compute.manager   File "/opt/stack/nova/nova/compute/manager.py", line 9973, in _update_available_resource_for_node
jún 29 10:47:27 devstack1v nova-compute[379150]: ERROR nova.compute.manager     self.rt.update_available_resource(context, nodename,
jún 29 10:47:27 devstack1v nova-compute[379150]: ERROR nova.compute.manager   File "/opt/stack/nova/nova/compute/resource_tracker.py", line 896, in update_available_resource
jún 29 10:47:27 devstack1v nova-compute[379150]: ERROR nova.compute.manager     self._update_available_resource(context, resources, startup=startup)
jún 29 10:47:27 devstack1v nova-compute[379150]: ERROR nova.compute.manager   File "/usr/local/lib/python3.8/dist-packages/oslo_concurrency/lockutils.py", line 360, in inner
jún 29 10:47:27 devstack1v nova-compute[379150]: ERROR nova.compute.manager     return f(*args, **kwargs)
jún 29 10:47:27 devstack1v nova-compute[379150]: ERROR nova.compute.manager   File "/opt/stack/nova/nova/compute/resource_tracker.py", line 968, in _update_available_resource
jún 29 10:47:27 devstack1v nova-compute[379150]: ERROR nova.compute.manager     instance_by_uuid = self._update_usage_from_instances(
jún 29 10:47:27 devstack1v nova-compute[379150]: ERROR nova.compute.manager   File "/opt/stack/nova/nova/compute/resource_tracker.py", line 1534, in _update_usage_from_instances
jún 29 10:47:27 devstack1v nova-compute[379150]: ERROR nova.compute.manager     self._update_usage_from_instance(context, instance, nodename)
jún 29 10:47:27 devstack1v nova-compute[379150]: ERROR nova.compute.manager   File "/opt/stack/nova/nova/compute/resource_tracker.py", line 1497, in _update_usage_from_instance
jún 29 10:47:27 devstack1v nova-compute[379150]: ERROR nova.compute.manager     self._update_usage(self._get_usage_dict(instance, instance),
jún 29 10:47:27 devstack1v nova-compute[379150]: ERROR nova.compute.manager   File "/opt/stack/nova/nova/compute/resource_tracker.py", line 1301, in _update_usage
jún 29 10:47:27 devstack1v nova-compute[379150]: ERROR nova.compute.manager     cn.numa_topology = hardware.numa_usage_from_instance_numa(
jún 29 10:47:27 devstack1v nova-compute[379150]: ERROR nova.compute.manager   File "/opt/stack/nova/nova/virt/hardware.py", line 2450, in numa_usage_from_instance_numa
jún 29 10:47:27 devstack1v nova-compute[379150]: ERROR nova.compute.manager     new_cell.pin_cpus(pinned_cpus)
jún 29 10:47:27 devstack1v nova-compute[379150]: ERROR nova.compute.manager   File "/opt/stack/nova/nova/objects/numa.py", line 91, in pin_cpus
jún 29 10:47:27 devstack1v nova-compute[379150]: ERROR nova.compute.manager     raise exception.CPUPinningInvalid(requested=list(cpus),
jún 29 10:47:27 devstack1v nova-compute[379150]: ERROR nova.compute.manager nova.exception.CPUPinningInvalid: CPU set to pin [0] must be a subset of free CPU set [1]
jún 29 10:47:27 devstack1v nova-compute[379150]: ERROR nova.compute.manager
jún 29 10:47:28 devstack1v nova-compute[379150]: DEBUG oslo_service.periodic_task [None req-a52cbcd3-ad79-457e-a0e1-3ec225226411 None None] Running periodic task ComputeManager._instance_usage_audit {{(pid=379150) run_periodic_tasks /usr/local/lib/python3.8/dist-packages/oslo_service/periodic_task.py:210}}
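
For context, here is a minimal sketch in plain Python (not nova's actual code; the real check lives in nova/objects/numa.py) of the invariant that pin_cpus() enforces and that this bug violates: CPUs requested for pinning must be a subset of the cell's free dedicated CPUs.

```python
# Toy model of a NUMA cell's dedicated-CPU bookkeeping. If the first
# instance's pin is recorded, a second request for the same CPU fails
# with exactly the kind of error seen in the traceback above.

class CPUPinningInvalid(Exception):
    pass

class Cell:
    def __init__(self, pcpuset):
        self.pcpuset = set(pcpuset)   # dedicated CPUs, e.g. {0, 1}
        self.pinned_cpus = set()      # CPUs already consumed

    @property
    def free_pcpus(self):
        return self.pcpuset - self.pinned_cpus

    def pin_cpus(self, cpus):
        cpus = set(cpus)
        if not cpus <= self.free_pcpus:
            raise CPUPinningInvalid(
                f"CPU set to pin {sorted(cpus)} must be a subset "
                f"of free CPU set {sorted(self.free_pcpus)}")
        self.pinned_cpus |= cpus

cell = Cell({0, 1})
cell.pin_cpus({0})        # vm0's pin is recorded
try:
    cell.pin_cpus({0})    # a second pin on the same CPU must fail
except CPUPinningInvalid as e:
    print(e)  # CPU set to pin [0] must be a subset of free CPU set [1]
```

The bug report suggests the failure is the mirror image of this check: after unshelving vm0, its pin was apparently never recorded, so vm1's fit succeeded where it should not have, and the periodic resource update later tripped over the inconsistency.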

Between the two unshelves the update_available_resource periodic task ran once (the 10:45:24 run below):
$ egrep 'Running periodic task ComputeManager.update_available_resource' ~/n-cpu.log
jún 29 10:35:16 devstack1v nova-compute[379150]: DEBUG oslo_service.periodic_task [None req-a52cbcd3-ad79-457e-a0e1-3ec225226411 None None] Running periodic task ComputeManager.update_available_resource {{(pid=379150) run_periodic_tasks /usr/local/lib/python3.8/dist-packages/oslo_service/periodic_task.py:210}}
jún 29 10:36:17 devstack1v nova-compute[379150]: DEBUG oslo_service.periodic_task [None req-a52cbcd3-ad79-457e-a0e1-3ec225226411 None None] Running periodic task ComputeManager.update_available_resource {{(pid=379150) run_periodic_tasks /usr/local/lib/python3.8/dist-packages/oslo_service/periodic_task.py:210}}
jún 29 10:37:17 devstack1v nova-compute[379150]: DEBUG oslo_service.periodic_task [None req-a52cbcd3-ad79-457e-a0e1-3ec225226411 None None] Running periodic task ComputeManager.update_available_resource {{(pid=379150) run_periodic_tasks /usr/local/lib/python3.8/dist-packages/oslo_service/periodic_task.py:210}}
jún 29 10:38:19 devstack1v nova-compute[379150]: DEBUG oslo_service.periodic_task [None req-a52cbcd3-ad79-457e-a0e1-3ec225226411 None None] Running periodic task ComputeManager.update_available_resource {{(pid=379150) run_periodic_tasks /usr/local/lib/python3.8/dist-packages/oslo_service/periodic_task.py:210}}
jún 29 10:39:20 devstack1v nova-compute[379150]: DEBUG oslo_service.periodic_task [None req-a52cbcd3-ad79-457e-a0e1-3ec225226411 None None] Running periodic task ComputeManager.update_available_resource {{(pid=379150) run_periodic_tasks /usr/local/lib/python3.8/dist-packages/oslo_service/periodic_task.py:210}}
jún 29 10:40:22 devstack1v nova-compute[379150]: DEBUG oslo_service.periodic_task [None req-a52cbcd3-ad79-457e-a0e1-3ec225226411 None None] Running periodic task ComputeManager.update_available_resource {{(pid=379150) run_periodic_tasks /usr/local/lib/python3.8/dist-packages/oslo_service/periodic_task.py:210}}
jún 29 10:41:22 devstack1v nova-compute[379150]: DEBUG oslo_service.periodic_task [None req-a52cbcd3-ad79-457e-a0e1-3ec225226411 None None] Running periodic task ComputeManager.update_available_resource {{(pid=379150) run_periodic_tasks /usr/local/lib/python3.8/dist-packages/oslo_service/periodic_task.py:210}}
jún 29 10:42:23 devstack1v nova-compute[379150]: DEBUG oslo_service.periodic_task [None req-a52cbcd3-ad79-457e-a0e1-3ec225226411 None None] Running periodic task ComputeManager.update_available_resource {{(pid=379150) run_periodic_tasks /usr/local/lib/python3.8/dist-packages/oslo_service/periodic_task.py:210}}
jún 29 10:43:24 devstack1v nova-compute[379150]: DEBUG oslo_service.periodic_task [None req-a52cbcd3-ad79-457e-a0e1-3ec225226411 None None] Running periodic task ComputeManager.update_available_resource {{(pid=379150) run_periodic_tasks /usr/local/lib/python3.8/dist-packages/oslo_service/periodic_task.py:210}}
jún 29 10:44:24 devstack1v nova-compute[379150]: DEBUG oslo_service.periodic_task [None req-a52cbcd3-ad79-457e-a0e1-3ec225226411 None None] Running periodic task ComputeManager.update_available_resource {{(pid=379150) run_periodic_tasks /usr/local/lib/python3.8/dist-packages/oslo_service/periodic_task.py:210}}
jún 29 10:45:24 devstack1v nova-compute[379150]: DEBUG oslo_service.periodic_task [None req-a52cbcd3-ad79-457e-a0e1-3ec225226411 None None] Running periodic task ComputeManager.update_available_resource {{(pid=379150) run_periodic_tasks /usr/local/lib/python3.8/dist-packages/oslo_service/periodic_task.py:210}}
jún 29 10:46:26 devstack1v nova-compute[379150]: DEBUG oslo_service.periodic_task [None req-a52cbcd3-ad79-457e-a0e1-3ec225226411 None None] Running periodic task ComputeManager.update_available_resource {{(pid=379150) run_periodic_tasks /usr/local/lib/python3.8/dist-packages/oslo_service/periodic_task.py:210}}
jún 29 10:47:27 devstack1v nova-compute[379150]: DEBUG oslo_service.periodic_task [None req-a52cbcd3-ad79-457e-a0e1-3ec225226411 None None] Running periodic task ComputeManager.update_available_resource {{(pid=379150) run_periodic_tasks /usr/local/lib/python3.8/dist-packages/oslo_service/periodic_task.py:210}}
jún 29 10:48:28 devstack1v nova-compute[379150]: DEBUG oslo_service.periodic_task [None req-a52cbcd3-ad79-457e-a0e1-3ec225226411 None None] Running periodic task ComputeManager.update_available_resource {{(pid=379150) run_periodic_tasks /usr/local/lib/python3.8/dist-packages/oslo_service/periodic_task.py:210}}
jún 29 10:49:28 devstack1v nova-compute[379150]: DEBUG oslo_service.periodic_task [None req-a52cbcd3-ad79-457e-a0e1-3ec225226411 None None] Running periodic task ComputeManager.update_available_resource {{(pid=379150) run_periodic_tasks /usr/local/lib/python3.8/dist-packages/oslo_service/periodic_task.py:210}}
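
To make the "ran once" claim concrete, a small shell sketch (timestamps copied from the journal excerpt above and the unshelve action start times) counting the periodic runs that fall between the two unshelve requests:

```shell
# Periodic run times surrounding the window, taken from the grep output
# above; the unshelves started at 10:45:19 (vm0) and 10:45:35 (vm1).
runs="10:44:24 10:45:24 10:46:26"
count=0
for t in $runs; do
  # lexicographic comparison works for zero-padded HH:MM:SS
  if [[ "$t" > "10:45:19" && "$t" < "10:45:35" ]]; then
    count=$((count + 1))
  fi
done
echo "$count"   # 1 -- only the 10:45:24 run falls between the unshelves
```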

I also attached a .tar.gz with the logs and domain XMLs, collected as follows:

for service in n-api n-cond-cell1 n-cpu n-sch n-super-cond
do
    sudo journalctl -u devstack@$service -S '2023-06-29 10:35:00' -U '2023-06-29 10:50:00' > ~/$service.log
done

$ sudo virsh list
 Id   Name                State
-----------------------------------
 63   instance-0000001f   running
 64   instance-00000020   running

sudo virsh dumpxml instance-0000001f > ~/instance-0000001f.xml
sudo virsh dumpxml instance-00000020 > ~/instance-00000020.xml
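
The overlap can also be confirmed from these dumps with just the standard library. A hedged sketch (the XML strings below are trimmed stand-ins for real dumpxml output, and it assumes single-CPU cpuset values rather than ranges like "0-1"):

```python
# Parse <cputune><vcpupin cpuset=.../> from libvirt domain XML and
# report any host CPUs shared between two supposedly dedicated guests.
import xml.etree.ElementTree as ET

def pinned_cpus(domain_xml):
    """Return the set of host CPUs named by <vcpupin cpuset=...>."""
    root = ET.fromstring(domain_xml)
    return {int(pin.get('cpuset'))
            for pin in root.findall('./cputune/vcpupin')}

vm0_xml = "<domain><cputune><vcpupin vcpu='0' cpuset='0'/></cputune></domain>"
vm1_xml = "<domain><cputune><vcpupin vcpu='0' cpuset='0'/></cputune></domain>"

overlap = pinned_cpus(vm0_xml) & pinned_cpus(vm1_xml)
print(overlap)  # {0} -- both instances pinned to host CPU 0
```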

Let me know if I can collect more information.

** Affects: nova
     Importance: Undecided
         Status: New

** Attachment added: "unshelve-overlapping-pinned-cpus.tar.gz"
   https://bugs.launchpad.net/bugs/2025480/+attachment/5683100/+files/unshelve-overlapping-pinned-cpus.tar.gz

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/2025480

Title:
  overlapping pinned CPUs after unshelve

Status in OpenStack Compute (nova):
  New

Bug description:
It seems that after unshelve, occasionally the request for a dedicated
  CPU is ignored. More precisely, the first instance's pinned CPU does
  not seem to be marked as consumed, so the second instance may end up
  on the same CPU. This was first observed on victoria (6 times out of
  46 tries), but then I was able to reproduce it on master too (6 times
  out of 20 tries). The attached logs are from the victoria environment,
  a single-host all-in-one devstack running only the VMs used for this
  reproduction.

  stable/victoria
  devstack 3eb6e2d7
  nova 1aca09b966

  master
  devstack b10c0602
  nova 2aea80c0af

  config:
  [[post-config|/etc/nova/nova.conf]]
  [DEFAULT]
  scheduler_default_filters = NUMATopologyFilter, ...
  [compute]
  cpu_dedicated_set = 0,1

  Confirming this config in placement:
  $ openstack --os-placement-api-version 1.17 resource provider inventory show 46b3d4de-bb45-4607-8860-040eb2dcd0d7 PCPU
  +------------------+-------+
  | Field            | Value |
  +------------------+-------+
  | allocation_ratio | 1.0   |
  | min_unit         | 1     |
  | max_unit         | 2     |
  | reserved         | 0     |
  | step_size        | 1     |
  | total            | 2     |
  +------------------+-------+

  Reproduction steps:
  openstack flavor create cirros256-pinned --public --vcpus 1 --ram 256 --disk 1 --property hw_rng:allowed=True --property hw:cpu_policy=dedicated

  openstack server list -f value -c ID | xargs -r openstack server
  delete --wait

  openstack server create --flavor cirros256-pinned --image cirros-0.5.1-x86_64-disk --nic net-id=private vm0 --wait
  openstack server shelve vm0
  sleep 10 # make sure shelve finished
  openstack server create --flavor cirros256-pinned --image cirros-0.5.1-x86_64-disk --nic net-id=private vm1 --wait
  openstack server shelve vm1
  sleep 10

  openstack server unshelve vm0 ; sleep 15 ; openstack server unshelve vm1 # the amount of sleep could easily be relevant
  watch openstack server list # wait until both go ACTIVE

  # both vms ended up on the same cpu
  $ for vm in $( sudo virsh list --name ) ; do sudo virsh dumpxml $vm | xmlstarlet sel -t -v '//vcpupin/@cpuset' ; echo ; done
  0
  0

  Data collected from the environment where the above reproduction
  triggered the bug:

  $ openstack server list
  +--------------------------------------+------+--------+--------------------------------------------------------+--------------------------+------------------+
  | ID                                   | Name | Status | Networks                                               | Image                    | Flavor           |
  +--------------------------------------+------+--------+--------------------------------------------------------+--------------------------+------------------+
  | 4734b8a5-a6dd-432a-86c9-ba0367bb86cc | vm1  | ACTIVE | private=10.0.0.27, fdfb:ab27:b2b2:0:f816:3eff:fe80:2fd | cirros-0.5.1-x86_64-disk | cirros256-pinned |
  | e30de509-6988-4535-a6f5-520c52fba087 | vm0  | ACTIVE | private=10.0.0.6, fdfb:ab27:b2b2:0:f816:3eff:fe78:d368 | cirros-0.5.1-x86_64-disk | cirros256-pinned |
  +--------------------------------------+------+--------+--------------------------------------------------------+--------------------------+------------------+

  $ openstack server show vm0
  +-------------------------------------+-----------------------------------------------------------------+
  | Field                               | Value                                                           |
  +-------------------------------------+-----------------------------------------------------------------+
  | OS-DCF:diskConfig                   | MANUAL                                                          |
  | OS-EXT-AZ:availability_zone         | nova                                                            |
  | OS-EXT-SRV-ATTR:host                | devstack1v                                                      |
  | OS-EXT-SRV-ATTR:hypervisor_hostname | devstack1v                                                      |
  | OS-EXT-SRV-ATTR:instance_name       | instance-0000001f                                               |
  | OS-EXT-STS:power_state              | Running                                                         |
  | OS-EXT-STS:task_state               | None                                                            |
  | OS-EXT-STS:vm_state                 | active                                                          |
  | OS-SRV-USG:launched_at              | 2023-06-29T10:45:25.000000                                      |
  | OS-SRV-USG:terminated_at            | None                                                            |
  | accessIPv4                          |                                                                 |
  | accessIPv6                          |                                                                 |
  | addresses                           | private=10.0.0.6, fdfb:ab27:b2b2:0:f816:3eff:fe78:d368          |
  | config_drive                        |                                                                 |
  | created                             | 2023-06-29T10:44:42Z                                            |
  | flavor                              | cirros256-pinned (a61fac7f-9d3a-4788-8dd8-b84534ed5ba6)         |
  | hostId                              | a660a4498447edab1528a7c92f7eaccd7aad2c258c896f7820aef9f6        |
  | id                                  | e30de509-6988-4535-a6f5-520c52fba087                            |
  | image                               | cirros-0.5.1-x86_64-disk (aaefde72-6da8-45e3-bc58-d7d1b9333d0e) |
  | key_name                            | None                                                            |
  | name                                | vm0                                                             |
  | progress                            | 0                                                               |
  | project_id                          | 0ad9573f6c57482bbd8202ce43706ca6                                |
  | properties                          |                                                                 |
  | security_groups                     | name='default'                                                  |
  | status                              | ACTIVE                                                          |
  | updated                             | 2023-06-29T10:45:26Z                                            |
  | user_id                             | 971105d66de542588ddccdef15594fab                                |
  | volumes_attached                    |                                                                 |
  +-------------------------------------+-----------------------------------------------------------------+

  $ openstack server show vm1
  +-------------------------------------+-----------------------------------------------------------------+
  | Field                               | Value                                                           |
  +-------------------------------------+-----------------------------------------------------------------+
  | OS-DCF:diskConfig                   | MANUAL                                                          |
  | OS-EXT-AZ:availability_zone         | nova                                                            |
  | OS-EXT-SRV-ATTR:host                | devstack1v                                                      |
  | OS-EXT-SRV-ATTR:hypervisor_hostname | devstack1v                                                      |
  | OS-EXT-SRV-ATTR:instance_name       | instance-00000020                                               |
  | OS-EXT-STS:power_state              | Running                                                         |
  | OS-EXT-STS:task_state               | None                                                            |
  | OS-EXT-STS:vm_state                 | active                                                          |
  | OS-SRV-USG:launched_at              | 2023-06-29T10:45:45.000000                                      |
  | OS-SRV-USG:terminated_at            | None                                                            |
  | accessIPv4                          |                                                                 |
  | accessIPv6                          |                                                                 |
  | addresses                           | private=10.0.0.27, fdfb:ab27:b2b2:0:f816:3eff:fe80:2fd          |
  | config_drive                        |                                                                 |
  | created                             | 2023-06-29T10:45:00Z                                            |
  | flavor                              | cirros256-pinned (a61fac7f-9d3a-4788-8dd8-b84534ed5ba6)         |
  | hostId                              | a660a4498447edab1528a7c92f7eaccd7aad2c258c896f7820aef9f6        |
  | id                                  | 4734b8a5-a6dd-432a-86c9-ba0367bb86cc                            |
  | image                               | cirros-0.5.1-x86_64-disk (aaefde72-6da8-45e3-bc58-d7d1b9333d0e) |
  | key_name                            | None                                                            |
  | name                                | vm1                                                             |
  | progress                            | 0                                                               |
  | project_id                          | 0ad9573f6c57482bbd8202ce43706ca6                                |
  | properties                          |                                                                 |
  | security_groups                     | name='default'                                                  |
  | status                              | ACTIVE                                                          |
  | updated                             | 2023-06-29T10:45:45Z                                            |
  | user_id                             | 971105d66de542588ddccdef15594fab                                |
  | volumes_attached                    |                                                                 |
  +-------------------------------------+-----------------------------------------------------------------+

  $ openstack flavor show cirros256-pinned
  +----------------------------+--------------------------------------------------+
  | Field                      | Value                                            |
  +----------------------------+--------------------------------------------------+
  | OS-FLV-DISABLED:disabled   | False                                            |
  | OS-FLV-EXT-DATA:ephemeral  | 0                                                |
  | access_project_ids         | None                                             |
  | disk                       | 1                                                |
  | id                         | a61fac7f-9d3a-4788-8dd8-b84534ed5ba6             |
  | name                       | cirros256-pinned                                 |
  | os-flavor-access:is_public | True                                             |
  | properties                 | hw:cpu_policy='dedicated', hw_rng:allowed='True' |
  | ram                        | 256                                              |
  | rxtx_factor                | 1.0                                              |
  | swap                       |                                                  |
  | vcpus                      | 1                                                |
  +----------------------------+--------------------------------------------------+

  $ openstack server event list vm0
  +------------------------------------------+--------------------------------------+----------+----------------------------+
  | Request ID                               | Server ID                            | Action   | Start Time                 |
  +------------------------------------------+--------------------------------------+----------+----------------------------+
  | req-e09149a8-6ef9-49c3-a757-57bbc14d6004 | e30de509-6988-4535-a6f5-520c52fba087 | unshelve | 2023-06-29T10:45:19.000000 |
  | req-4628f975-a57b-468e-a41c-0ef405df4bae | e30de509-6988-4535-a6f5-520c52fba087 | shelve   | 2023-06-29T10:44:49.000000 |
  | req-d7595149-3cc6-472c-9da8-7a40c020a620 | e30de509-6988-4535-a6f5-520c52fba087 | create   | 2023-06-29T10:44:42.000000 |
  +------------------------------------------+--------------------------------------+----------+----------------------------+
  $ openstack server event show vm0 req-e09149a8-6ef9-49c3-a757-57bbc14d6004
  +---------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
  | Field         | Value                                                                                                                                                                                                                      |
  +---------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
  | action        | unshelve                                                                                                                                                                                                                   |
  | events        | [{'event': 'compute_unshelve_instance', 'start_time': '2023-06-29T10:45:19.000000', 'finish_time': '2023-06-29T10:45:26.000000', 'result': 'Success', 'traceback': None}, {'event': 'schedule_instances', 'start_time':    |
  |               | '2023-06-29T10:45:19.000000', 'finish_time': '2023-06-29T10:45:19.000000', 'result': 'Success', 'traceback': None}, {'event': 'get_image_info', 'start_time': '2023-06-29T10:45:19.000000', 'finish_time':                 |
  |               | '2023-06-29T10:45:19.000000', 'result': 'Success', 'traceback': None}]                                                                                                                                                     |
  | instance_uuid | e30de509-6988-4535-a6f5-520c52fba087                                                                                                                                                                                       |
  | message       | None                                                                                                                                                                                                                       |
  | project_id    | 0ad9573f6c57482bbd8202ce43706ca6                                                                                                                                                                                           |
  | request_id    | req-e09149a8-6ef9-49c3-a757-57bbc14d6004                                                                                                                                                                                   |
  | start_time    | 2023-06-29T10:45:19.000000                                                                                                                                                                                                 |
  | user_id       | 971105d66de542588ddccdef15594fab                                                                                                                                                                                           |
  +---------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
  $ openstack server event show vm0 req-4628f975-a57b-468e-a41c-0ef405df4bae
  +---------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
  | Field         | Value                                                                                                                                                                   |
  +---------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
  | action        | shelve                                                                                                                                                                  |
  | events        | [{'event': 'compute_shelve_instance', 'start_time': '2023-06-29T10:44:49.000000', 'finish_time': '2023-06-29T10:44:55.000000', 'result': 'Success', 'traceback': None}] |
  | instance_uuid | e30de509-6988-4535-a6f5-520c52fba087                                                                                                                                    |
  | message       | None                                                                                                                                                                    |
  | project_id    | 0ad9573f6c57482bbd8202ce43706ca6                                                                                                                                        |
  | request_id    | req-4628f975-a57b-468e-a41c-0ef405df4bae                                                                                                                                |
  | start_time    | 2023-06-29T10:44:49.000000                                                                                                                                              |
  | user_id       | 971105d66de542588ddccdef15594fab                                                                                                                                        |
  +---------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

  $ openstack server event list vm1
  +------------------------------------------+--------------------------------------+----------+----------------------------+
  | Request ID                               | Server ID                            | Action   | Start Time                 |
  +------------------------------------------+--------------------------------------+----------+----------------------------+
  | req-9ab0eabc-bf97-48bf-b534-dd22df3c6452 | 4734b8a5-a6dd-432a-86c9-ba0367bb86cc | unshelve | 2023-06-29T10:45:35.000000 |
  | req-1dc9b980-d765-415a-b946-2daf53f361b1 | 4734b8a5-a6dd-432a-86c9-ba0367bb86cc | shelve   | 2023-06-29T10:45:07.000000 |
  | req-5071fd9f-00e2-44f8-9b6c-8715fe918e98 | 4734b8a5-a6dd-432a-86c9-ba0367bb86cc | create   | 2023-06-29T10:45:00.000000 |
  +------------------------------------------+--------------------------------------+----------+----------------------------+
  $ openstack server event show vm1 req-9ab0eabc-bf97-48bf-b534-dd22df3c6452
  +---------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
  | Field         | Value                                                                                                                                                                                                                      |
  +---------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
  | action        | unshelve                                                                                                                                                                                                                   |
  | events        | [{'event': 'compute_unshelve_instance', 'start_time': '2023-06-29T10:45:36.000000', 'finish_time': '2023-06-29T10:45:45.000000', 'result': 'Success', 'traceback': None}, {'event': 'schedule_instances', 'start_time':    |
  |               | '2023-06-29T10:45:35.000000', 'finish_time': '2023-06-29T10:45:36.000000', 'result': 'Success', 'traceback': None}, {'event': 'get_image_info', 'start_time': '2023-06-29T10:45:35.000000', 'finish_time':                 |
  |               | '2023-06-29T10:45:35.000000', 'result': 'Success', 'traceback': None}]                                                                                                                                                     |
  | instance_uuid | 4734b8a5-a6dd-432a-86c9-ba0367bb86cc                                                                                                                                                                                       |
  | message       | None                                                                                                                                                                                                                       |
  | project_id    | 0ad9573f6c57482bbd8202ce43706ca6                                                                                                                                                                                           |
  | request_id    | req-9ab0eabc-bf97-48bf-b534-dd22df3c6452                                                                                                                                                                                   |
  | start_time    | 2023-06-29T10:45:35.000000                                                                                                                                                                                                 |
  | user_id       | 971105d66de542588ddccdef15594fab                                                                                                                                                                                           |
  +---------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
  $ openstack server event show vm1 req-1dc9b980-d765-415a-b946-2daf53f361b1
  +---------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
  | Field         | Value                                                                                                                                                                   |
  +---------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
  | action        | shelve                                                                                                                                                                  |
  | events        | [{'event': 'compute_shelve_instance', 'start_time': '2023-06-29T10:45:08.000000', 'finish_time': '2023-06-29T10:45:13.000000', 'result': 'Success', 'traceback': None}] |
  | instance_uuid | 4734b8a5-a6dd-432a-86c9-ba0367bb86cc                                                                                                                                    |
  | message       | None                                                                                                                                                                    |
  | project_id    | 0ad9573f6c57482bbd8202ce43706ca6                                                                                                                                        |
  | request_id    | req-1dc9b980-d765-415a-b946-2daf53f361b1                                                                                                                                |
  | start_time    | 2023-06-29T10:45:07.000000                                                                                                                                              |
  | user_id       | 971105d66de542588ddccdef15594fab                                                                                                                                        |
  +---------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

  Logs that seemed relevant to me:
  vm0 unshelve:
  jún 29 10:45:19 devstack1v nova-compute[379150]: DEBUG nova.virt.hardware [None req-e09149a8-6ef9-49c3-a757-57bbc14d6004 admin admin] Attempting to fit instance cell InstanceNUMACell(cpu_pinning_raw={0=0},cpu_policy='dedicated',cpu_thread_policy=None,cpu_topology=VirtCPUTopology,cpuset=set([]),cpuset_reserved=None,id=0,memory=256,pagesize=None,pcpuset=set([0])) on host_cell NUMACell(cpu_usage=0,cpuset=set([]),id=0,memory=7957,memory_usage=0,mempages=[NUMAPagesTopology,NUMAPagesTopology],network_metadata=NetworkMetadata,pcpuset=set([0,1]),pinned_cpus=set([]),siblings=[set([0]),set([1])]) {{(pid=379150) _numa_fit_instance_cell /opt/stack/nova/nova/virt/hardware.py:1046}}
  jún 29 10:45:19 devstack1v nova-compute[379150]: DEBUG nova.virt.hardware [None req-e09149a8-6ef9-49c3-a757-57bbc14d6004 admin admin] Pinning has been requested {{(pid=379150) _numa_fit_instance_cell /opt/stack/nova/nova/virt/hardware.py:1133}}
  jún 29 10:45:19 devstack1v nova-compute[379150]: DEBUG nova.virt.hardware [None req-e09149a8-6ef9-49c3-a757-57bbc14d6004 admin admin] Packing an instance onto a set of siblings:     host_cell_free_siblings: [{0}, {1}]    instance_cell: InstanceNUMACell(cpu_pinning_raw={0=0},cpu_policy='dedicated',cpu_thread_policy=None,cpu_topology=VirtCPUTopology,cpuset=set([]),cpuset_reserved=None,id=0,memory=256,pagesize=None,pcpuset=set([0]))    host_cell_id: 0    threads_per_core: 1    num_cpu_reserved: 0 {{(pid=379150) _pack_instance_onto_cores /opt/stack/nova/nova/virt/hardware.py:731}}
  jún 29 10:45:19 devstack1v nova-compute[379150]: DEBUG nova.virt.hardware [None req-e09149a8-6ef9-49c3-a757-57bbc14d6004 admin admin] Built sibling_sets: defaultdict(<class 'list'>, {1: [{0}, {1}]}) {{(pid=379150) _pack_instance_onto_cores /opt/stack/nova/nova/virt/hardware.py:752}}
  jún 29 10:45:19 devstack1v nova-compute[379150]: INFO nova.virt.hardware [None req-e09149a8-6ef9-49c3-a757-57bbc14d6004 admin admin] Computed NUMA topology CPU pinning: usable pCPUs: [[0], [1]], vCPUs mapping: [(0, 0)]
  jún 29 10:45:19 devstack1v nova-compute[379150]: DEBUG nova.virt.hardware [None req-e09149a8-6ef9-49c3-a757-57bbc14d6004 admin admin] Selected cores for pinning: [(0, 0)], in cell 0 {{(pid=379150) _pack_instance_onto_cores /opt/stack/nova/nova/virt/hardware.py:1013}}
  jún 29 10:45:19 devstack1v nova-compute[379150]: INFO nova.compute.claims [None req-e09149a8-6ef9-49c3-a757-57bbc14d6004 admin admin] [instance: e30de509-6988-4535-a6f5-520c52fba087] Claim successful on node devstack1v

  vm1 unshelve:
  jún 29 10:45:37 devstack1v nova-compute[379150]: DEBUG nova.virt.hardware [None req-9ab0eabc-bf97-48bf-b534-dd22df3c6452 admin admin] Attempting to fit instance cell InstanceNUMACell(cpu_pinning_raw={0=0},cpu_policy='dedicated',cpu_thread_policy=None,cpu_topology=VirtCPUTopology,cpuset=set([]),cpuset_reserved=None,id=0,memory=256,pagesize=None,pcpuset=set([0])) on host_cell NUMACell(cpu_usage=0,cpuset=set([]),id=0,memory=7957,memory_usage=0,mempages=[NUMAPagesTopology,NUMAPagesTopology],network_metadata=NetworkMetadata,pcpuset=set([0,1]),pinned_cpus=set([]),siblings=[set([0]),set([1])]) {{(pid=379150) _numa_fit_instance_cell /opt/stack/nova/nova/virt/hardware.py:1046}}
  jún 29 10:45:37 devstack1v nova-compute[379150]: DEBUG nova.virt.hardware [None req-9ab0eabc-bf97-48bf-b534-dd22df3c6452 admin admin] Pinning has been requested {{(pid=379150) _numa_fit_instance_cell /opt/stack/nova/nova/virt/hardware.py:1133}}
  jún 29 10:45:37 devstack1v nova-compute[379150]: DEBUG nova.virt.hardware [None req-9ab0eabc-bf97-48bf-b534-dd22df3c6452 admin admin] Packing an instance onto a set of siblings:     host_cell_free_siblings: [{0}, {1}]    instance_cell: InstanceNUMACell(cpu_pinning_raw={0=0},cpu_policy='dedicated',cpu_thread_policy=None,cpu_topology=VirtCPUTopology,cpuset=set([]),cpuset_reserved=None,id=0,memory=256,pagesize=None,pcpuset=set([0]))    host_cell_id: 0    threads_per_core: 1    num_cpu_reserved: 0 {{(pid=379150) _pack_instance_onto_cores /opt/stack/nova/nova/virt/hardware.py:731}}
  jún 29 10:45:37 devstack1v nova-compute[379150]: DEBUG nova.virt.hardware [None req-9ab0eabc-bf97-48bf-b534-dd22df3c6452 admin admin] Built sibling_sets: defaultdict(<class 'list'>, {1: [{0}, {1}]}) {{(pid=379150) _pack_instance_onto_cores /opt/stack/nova/nova/virt/hardware.py:752}}
  jún 29 10:45:37 devstack1v nova-compute[379150]: INFO nova.virt.hardware [None req-9ab0eabc-bf97-48bf-b534-dd22df3c6452 admin admin] Computed NUMA topology CPU pinning: usable pCPUs: [[0], [1]], vCPUs mapping: [(0, 0)]
  jún 29 10:45:37 devstack1v nova-compute[379150]: DEBUG nova.virt.hardware [None req-9ab0eabc-bf97-48bf-b534-dd22df3c6452 admin admin] Selected cores for pinning: [(0, 0)], in cell 0 {{(pid=379150) _pack_instance_onto_cores /opt/stack/nova/nova/virt/hardware.py:1013}}
  jún 29 10:45:37 devstack1v nova-compute[379150]: INFO nova.compute.claims [None req-9ab0eabc-bf97-48bf-b534-dd22df3c6452 admin admin] [instance: 4734b8a5-a6dd-432a-86c9-ba0367bb86cc] Claim successful on node devstack1v

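  Note that in both excerpts the host cell reports pinned_cpus=set([]): when
  vm1 is fitted, vm0's pinning from 18 seconds earlier is not reflected, so
  the packing logic naturally selects CPU 0 again. An illustrative sketch of
  why that happens (this is not nova's actual _pack_instance_onto_cores, just
  a minimal pack-lowest-first stand-in):

```python
# Hypothetical helper: with an empty pinned_cpus set, a pack-lowest-first
# strategy picks the same CPU on every call.
def pick_pinned_cpu(pcpuset, pinned_cpus):
    free = sorted(set(pcpuset) - set(pinned_cpus))
    return free[0] if free else None

# What the logs show: both fit attempts see pinned_cpus=set()
print(pick_pinned_cpu({0, 1}, set()))  # vm0 -> 0
print(pick_pinned_cpu({0, 1}, set()))  # vm1 -> 0, overlapping with vm0

# What the second attempt should have seen if vm0's claim were tracked:
print(pick_pinned_cpu({0, 1}, {0}))   # -> 1
```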
  The first error in the logs came not during unshelve but well after:
  jún 29 10:47:27 devstack1v nova-compute[379150]: ERROR nova.compute.manager [None req-a52cbcd3-ad79-457e-a0e1-3ec225226411 None None] Error updating resources for node devstack1v.: nova.exception.CPUPinningInvalid: CPU set to pin [0] must be a subset of free CPU set [1]
  jún 29 10:47:27 devstack1v nova-compute[379150]: ERROR nova.compute.manager Traceback (most recent call last):
  jún 29 10:47:27 devstack1v nova-compute[379150]: ERROR nova.compute.manager   File "/opt/stack/nova/nova/compute/manager.py", line 9973, in _update_available_resource_for_node
  jún 29 10:47:27 devstack1v nova-compute[379150]: ERROR nova.compute.manager     self.rt.update_available_resource(context, nodename,
  jún 29 10:47:27 devstack1v nova-compute[379150]: ERROR nova.compute.manager   File "/opt/stack/nova/nova/compute/resource_tracker.py", line 896, in update_available_resource
  jún 29 10:47:27 devstack1v nova-compute[379150]: ERROR nova.compute.manager     self._update_available_resource(context, resources, startup=startup)
  jún 29 10:47:27 devstack1v nova-compute[379150]: ERROR nova.compute.manager   File "/usr/local/lib/python3.8/dist-packages/oslo_concurrency/lockutils.py", line 360, in inner
  jún 29 10:47:27 devstack1v nova-compute[379150]: ERROR nova.compute.manager     return f(*args, **kwargs)
  jún 29 10:47:27 devstack1v nova-compute[379150]: ERROR nova.compute.manager   File "/opt/stack/nova/nova/compute/resource_tracker.py", line 968, in _update_available_resource
  jún 29 10:47:27 devstack1v nova-compute[379150]: ERROR nova.compute.manager     instance_by_uuid = self._update_usage_from_instances(
  jún 29 10:47:27 devstack1v nova-compute[379150]: ERROR nova.compute.manager   File "/opt/stack/nova/nova/compute/resource_tracker.py", line 1534, in _update_usage_from_instances
  jún 29 10:47:27 devstack1v nova-compute[379150]: ERROR nova.compute.manager     self._update_usage_from_instance(context, instance, nodename)
  jún 29 10:47:27 devstack1v nova-compute[379150]: ERROR nova.compute.manager   File "/opt/stack/nova/nova/compute/resource_tracker.py", line 1497, in _update_usage_from_instance
  jún 29 10:47:27 devstack1v nova-compute[379150]: ERROR nova.compute.manager     self._update_usage(self._get_usage_dict(instance, instance),
  jún 29 10:47:27 devstack1v nova-compute[379150]: ERROR nova.compute.manager   File "/opt/stack/nova/nova/compute/resource_tracker.py", line 1301, in _update_usage
  jún 29 10:47:27 devstack1v nova-compute[379150]: ERROR nova.compute.manager     cn.numa_topology = hardware.numa_usage_from_instance_numa(
  jún 29 10:47:27 devstack1v nova-compute[379150]: ERROR nova.compute.manager   File "/opt/stack/nova/nova/virt/hardware.py", line 2450, in numa_usage_from_instance_numa
  jún 29 10:47:27 devstack1v nova-compute[379150]: ERROR nova.compute.manager     new_cell.pin_cpus(pinned_cpus)
  jún 29 10:47:27 devstack1v nova-compute[379150]: ERROR nova.compute.manager   File "/opt/stack/nova/nova/objects/numa.py", line 91, in pin_cpus
  jún 29 10:47:27 devstack1v nova-compute[379150]: ERROR nova.compute.manager     raise exception.CPUPinningInvalid(requested=list(cpus),
  jún 29 10:47:27 devstack1v nova-compute[379150]: ERROR nova.compute.manager nova.exception.CPUPinningInvalid: CPU set to pin [0] must be a subset of free CPU set [1]
  jún 29 10:47:27 devstack1v nova-compute[379150]: ERROR nova.compute.manager
  jún 29 10:47:28 devstack1v nova-compute[379150]: DEBUG oslo_service.periodic_task [None req-a52cbcd3-ad79-457e-a0e1-3ec225226411 None None] Running periodic task ComputeManager._instance_usage_audit {{(pid=379150) run_periodic_tasks /usr/local/lib/python3.8/dist-packages/oslo_service/periodic_task.py:210}}

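  The check that fires above is easy to reproduce in isolation. A simplified
  sketch of the pin_cpus consistency check (the real implementation is
  NUMACell.pin_cpus in nova/objects/numa.py; the HostCell class and field
  names here are illustrative):

```python
# Simplified stand-in for the check raising in the traceback above.
class CPUPinningInvalid(Exception):
    pass

class HostCell:
    def __init__(self, pcpuset):
        self.pcpuset = set(pcpuset)   # dedicated host CPUs, e.g. {0, 1}
        self.pinned_cpus = set()      # CPUs already consumed by instances

    @property
    def free_pcpus(self):
        return self.pcpuset - self.pinned_cpus

    def pin_cpus(self, cpus):
        cpus = set(cpus)
        if not cpus.issubset(self.free_pcpus):
            raise CPUPinningInvalid(
                "CPU set to pin %s must be a subset of free CPU set %s"
                % (sorted(cpus), sorted(self.free_pcpus)))
        self.pinned_cpus |= cpus

# Replaying the bug: both unshelved instances claim CPU 0, so the second
# pin_cpus call raises the same message seen in the periodic-task error.
cell = HostCell({0, 1})
cell.pin_cpus({0})        # vm0's pinning is applied first
try:
    cell.pin_cpus({0})    # vm1 carries the same pinning
except CPUPinningInvalid as e:
    print(e)
```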
  Between the two unshelves (10:45:19 and 10:45:37) the update_available_resource periodic task ran exactly once, at 10:45:24:
  $ egrep 'Running periodic task ComputeManager.update_available_resource' ~/n-cpu.log
  jún 29 10:35:16 devstack1v nova-compute[379150]: DEBUG oslo_service.periodic_task [None req-a52cbcd3-ad79-457e-a0e1-3ec225226411 None None] Running periodic task ComputeManager.update_available_resource {{(pid=379150) run_periodic_tasks /usr/local/lib/python3.8/dist-packages/oslo_service/periodic_task.py:210}}
  jún 29 10:36:17 devstack1v nova-compute[379150]: DEBUG oslo_service.periodic_task [None req-a52cbcd3-ad79-457e-a0e1-3ec225226411 None None] Running periodic task ComputeManager.update_available_resource {{(pid=379150) run_periodic_tasks /usr/local/lib/python3.8/dist-packages/oslo_service/periodic_task.py:210}}
  jún 29 10:37:17 devstack1v nova-compute[379150]: DEBUG oslo_service.periodic_task [None req-a52cbcd3-ad79-457e-a0e1-3ec225226411 None None] Running periodic task ComputeManager.update_available_resource {{(pid=379150) run_periodic_tasks /usr/local/lib/python3.8/dist-packages/oslo_service/periodic_task.py:210}}
  jún 29 10:38:19 devstack1v nova-compute[379150]: DEBUG oslo_service.periodic_task [None req-a52cbcd3-ad79-457e-a0e1-3ec225226411 None None] Running periodic task ComputeManager.update_available_resource {{(pid=379150) run_periodic_tasks /usr/local/lib/python3.8/dist-packages/oslo_service/periodic_task.py:210}}
  jún 29 10:39:20 devstack1v nova-compute[379150]: DEBUG oslo_service.periodic_task [None req-a52cbcd3-ad79-457e-a0e1-3ec225226411 None None] Running periodic task ComputeManager.update_available_resource {{(pid=379150) run_periodic_tasks /usr/local/lib/python3.8/dist-packages/oslo_service/periodic_task.py:210}}
  jún 29 10:40:22 devstack1v nova-compute[379150]: DEBUG oslo_service.periodic_task [None req-a52cbcd3-ad79-457e-a0e1-3ec225226411 None None] Running periodic task ComputeManager.update_available_resource {{(pid=379150) run_periodic_tasks /usr/local/lib/python3.8/dist-packages/oslo_service/periodic_task.py:210}}
  jún 29 10:41:22 devstack1v nova-compute[379150]: DEBUG oslo_service.periodic_task [None req-a52cbcd3-ad79-457e-a0e1-3ec225226411 None None] Running periodic task ComputeManager.update_available_resource {{(pid=379150) run_periodic_tasks /usr/local/lib/python3.8/dist-packages/oslo_service/periodic_task.py:210}}
  jún 29 10:42:23 devstack1v nova-compute[379150]: DEBUG oslo_service.periodic_task [None req-a52cbcd3-ad79-457e-a0e1-3ec225226411 None None] Running periodic task ComputeManager.update_available_resource {{(pid=379150) run_periodic_tasks /usr/local/lib/python3.8/dist-packages/oslo_service/periodic_task.py:210}}
  jún 29 10:43:24 devstack1v nova-compute[379150]: DEBUG oslo_service.periodic_task [None req-a52cbcd3-ad79-457e-a0e1-3ec225226411 None None] Running periodic task ComputeManager.update_available_resource {{(pid=379150) run_periodic_tasks /usr/local/lib/python3.8/dist-packages/oslo_service/periodic_task.py:210}}
  jún 29 10:44:24 devstack1v nova-compute[379150]: DEBUG oslo_service.periodic_task [None req-a52cbcd3-ad79-457e-a0e1-3ec225226411 None None] Running periodic task ComputeManager.update_available_resource {{(pid=379150) run_periodic_tasks /usr/local/lib/python3.8/dist-packages/oslo_service/periodic_task.py:210}}
  jún 29 10:45:24 devstack1v nova-compute[379150]: DEBUG oslo_service.periodic_task [None req-a52cbcd3-ad79-457e-a0e1-3ec225226411 None None] Running periodic task ComputeManager.update_available_resource {{(pid=379150) run_periodic_tasks /usr/local/lib/python3.8/dist-packages/oslo_service/periodic_task.py:210}}
  jún 29 10:46:26 devstack1v nova-compute[379150]: DEBUG oslo_service.periodic_task [None req-a52cbcd3-ad79-457e-a0e1-3ec225226411 None None] Running periodic task ComputeManager.update_available_resource {{(pid=379150) run_periodic_tasks /usr/local/lib/python3.8/dist-packages/oslo_service/periodic_task.py:210}}
  jún 29 10:47:27 devstack1v nova-compute[379150]: DEBUG oslo_service.periodic_task [None req-a52cbcd3-ad79-457e-a0e1-3ec225226411 None None] Running periodic task ComputeManager.update_available_resource {{(pid=379150) run_periodic_tasks /usr/local/lib/python3.8/dist-packages/oslo_service/periodic_task.py:210}}
  jún 29 10:48:28 devstack1v nova-compute[379150]: DEBUG oslo_service.periodic_task [None req-a52cbcd3-ad79-457e-a0e1-3ec225226411 None None] Running periodic task ComputeManager.update_available_resource {{(pid=379150) run_periodic_tasks /usr/local/lib/python3.8/dist-packages/oslo_service/periodic_task.py:210}}
  jún 29 10:49:28 devstack1v nova-compute[379150]: DEBUG oslo_service.periodic_task [None req-a52cbcd3-ad79-457e-a0e1-3ec225226411 None None] Running periodic task ComputeManager.update_available_resource {{(pid=379150) run_periodic_tasks /usr/local/lib/python3.8/dist-packages/oslo_service/periodic_task.py:210}}

  I also attached a .tar.gz whose contents were collected as follows:

  for service in n-api n-cond-cell1 n-cpu n-sch n-super-cond
  do
      sudo journalctl -u devstack@$service -S '2023-06-29 10:35:00' -U '2023-06-29 10:50:00' > ~/$service.log
  done

  $ sudo virsh list
   Id   Name                State
  -----------------------------------
   63   instance-0000001f   running
   64   instance-00000020   running

  sudo virsh dumpxml instance-0000001f > ~/instance-0000001f.xml
  sudo virsh dumpxml instance-00000020 > ~/instance-00000020.xml

  Let me know if I can collect more information.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/2025480/+subscriptions


