yahoo-eng-team team mailing list archive

[Bug 1994526] [NEW] Asymmetric multi NUMA guest with cpu_policy: mixed fails to schedule

Public bug reported:

Based on https://bugzilla.redhat.com/show_bug.cgi?id=2135439

Compute Host NUMA Topology:

  NUMA node0 CPU(s):     0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38
  NUMA node1 CPU(s):     1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37,39

Host dedicated/shared CPU configuration:
[tripleo-admin@computesriov-1 ~]$ sudo crudini --get /var/lib/config-data/puppet-generated/nova_libvirt/etc/nova/nova.conf compute cpu_dedicated_set
24-39
[tripleo-admin@computesriov-1 ~]$ sudo crudini --get /var/lib/config-data/puppet-generated/nova_libvirt/etc/nova/nova.conf compute cpu_shared_set
20-23

[tripleo-admin@computesriov-0 ~]$ sudo crudini --get /var/lib/config-data/puppet-generated/nova_libvirt/etc/nova/nova.conf compute cpu_dedicated_set
4-19
[tripleo-admin@computesriov-0 ~]$ sudo crudini --get /var/lib/config-data/puppet-generated/nova_libvirt/etc/nova/nova.conf compute cpu_shared_set
0,1,2,3
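
As a rough cross-check of the configuration above, the following Python sketch intersects each host's cpu_shared_set and cpu_dedicated_set with the NUMA node CPU lists. parse_cpu_spec and split_by_numa_node are hypothetical helpers written for this illustration, not nova functions, and the sketch assumes both compute hosts have the NUMA layout listed above. For computesriov-1, node 0 should come out with shared CPUs 20 and 22 and dedicated CPUs 24 through 38 (even), which agrees with the host_cell shown in the scheduler log.

```python
def parse_cpu_spec(spec):
    """Expand a CPU list such as '24-39' or '0,1,2,3' into a set of ints.

    Hypothetical helper for illustration; handles only plain IDs and ranges.
    """
    cpus = set()
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            start, end = part.split("-")
            cpus.update(range(int(start), int(end) + 1))
        else:
            cpus.add(int(part))
    return cpus


# NUMA layout from the report: even CPUs on node 0, odd CPUs on node 1.
# Assumption: both compute hosts share this layout.
NUMA_NODES = {
    0: set(range(0, 40, 2)),
    1: set(range(1, 40, 2)),
}

# Values returned by crudini on each compute host.
HOSTS = {
    "computesriov-1": {"dedicated": "24-39", "shared": "20-23"},
    "computesriov-0": {"dedicated": "4-19", "shared": "0,1,2,3"},
}


def split_by_numa_node(host_conf):
    """Return {node_id: (shared_cpus, dedicated_cpus)} as sorted lists."""
    shared = parse_cpu_spec(host_conf["shared"])
    dedicated = parse_cpu_spec(host_conf["dedicated"])
    return {
        node: (sorted(shared & cpus), sorted(dedicated & cpus))
        for node, cpus in NUMA_NODES.items()
    }


for host, conf in HOSTS.items():
    for node, (shared, dedicated) in split_by_numa_node(conf).items():
        print(f"{host} node{node}: shared={shared} dedicated={dedicated}")
```

Each host NUMA node ends up with two shared CPUs and eight dedicated CPUs under this configuration.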

######################################################################
# Failed deployment (3 vCPU, asymmetric, and mixed dedicated policy) #
######################################################################

(overcloud) [stack@undercloud-0 ~]$ openstack flavor show tempest-MixedCPUPolicyTestMultiNuma-flavor-1583852539
/usr/lib/python3.9/site-packages/openstack/config/cloud_region.py:452: UserWarning: You have a configured API_VERSION with 'latest' in it. In the context of openstacksdk this doesn't make any sense.
  warnings.warn(
+----------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------+
| Field                      | Value                                                                                                                                                    |
+----------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------+
| OS-FLV-DISABLED:disabled   | False                                                                                                                                                    |
| OS-FLV-EXT-DATA:ephemeral  | 0                                                                                                                                                        |
| access_project_ids         | None                                                                                                                                                     |
| description                | None                                                                                                                                                     |
| disk                       | 1                                                                                                                                                        |
| id                         | 1495385529                                                                                                                                               |
| name                       | tempest-MixedCPUPolicyTestMultiNuma-flavor-1583852539                                                                                                    |
| os-flavor-access:is_public | True                                                                                                                                                     |
| properties                 | hw:cpu_dedicated_mask='^0', hw:cpu_policy='mixed', hw:numa_cpus.0='0', hw:numa_cpus.1='1,2', hw:numa_mem.0='256', hw:numa_mem.1='768', hw:numa_nodes='2' |
| ram                        | 1024                                                                                                                                                     |
| rxtx_factor                | 1.0                                                                                                                                                      |
| swap                       |                                                                                                                                                          |
| vcpus                      | 3                                                                                                                                                        |
+----------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------+
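
To make the flavor's extra specs concrete, the sketch below works out which vCPUs are shared and which are dedicated in each guest NUMA cell. It assumes that hw:cpu_dedicated_mask='^0' means every vCPU except vCPU 0 is dedicated, and it handles only the '^N' exclusion form and the comma-separated lists seen in this flavor. dedicated_vcpus and guest_cells are hypothetical helpers, not nova code.

```python
# Values copied from the flavor shown above.
VCPUS = 3
EXTRA_SPECS = {
    "hw:cpu_dedicated_mask": "^0",
    "hw:cpu_policy": "mixed",
    "hw:numa_cpus.0": "0",
    "hw:numa_cpus.1": "1,2",
    "hw:numa_nodes": "2",
}


def dedicated_vcpus(mask, vcpus):
    """Return the dedicated vCPU IDs for a mask made only of '^N' exclusions.

    Assumption: '^N' removes vCPU N from the dedicated set, and everything
    else is dedicated.
    """
    excluded = set()
    for part in mask.split(","):
        part = part.strip()
        if not part.startswith("^"):
            raise ValueError("this sketch only handles '^N' exclusions")
        excluded.add(int(part[1:]))
    return set(range(vcpus)) - excluded


def guest_cells(extra_specs, vcpus):
    """Split each guest NUMA cell's vCPUs into shared and dedicated lists."""
    dedicated = dedicated_vcpus(extra_specs["hw:cpu_dedicated_mask"], vcpus)
    cells = {}
    for cell_id in range(int(extra_specs["hw:numa_nodes"])):
        raw = extra_specs[f"hw:numa_cpus.{cell_id}"]
        cell_cpus = {int(c) for c in raw.split(",")}
        cells[cell_id] = {
            "shared": sorted(cell_cpus - dedicated),
            "dedicated": sorted(cell_cpus & dedicated),
        }
    return cells


for cell_id, cpus in guest_cells(EXTRA_SPECS, VCPUS).items():
    print(f"guest cell {cell_id}: shared={cpus['shared']} dedicated={cpus['dedicated']}")
```

Under these assumptions guest cell 0 holds only the shared vCPU 0 and guest cell 1 holds only the dedicated vCPUs 1 and 2. That is consistent with the InstanceNUMACell for cell 0 in the scheduler log, which reports cpuset=set([0]) and pcpuset=set([]).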

nova-scheduler.log:2022-10-17 16:25:17.163 12 DEBUG nova.virt.hardware [req-6307d41d-02a6-40f7-88d8-c7fcc2714202 0bfbbc8dfa1548819b0af96786b1ad43 12a4c6a7df9947a39f54dedf04b22fea - default default] Attempting to fit instance cell InstanceNUMACell(cpu_pinning_raw=None,cpu_policy='mixed',cpu_thread_policy=None,cpu_topology=<?>,cpuset=set([0]),cpuset_reserved=None,id=0,memory=256,pagesize=None,pcpuset=set([])) on host_cell NUMACell(cpu_usage=0,cpuset=set([20,22]),id=0,memory=128210,memory_usage=0,mempages=[NUMAPagesTopology,NUMAPagesTopology,NUMAPagesTopology],network_metadata=NetworkMetadata,pcpuset=set([32,34,36,38,24,26,28,30]),pinned_cpus=set([]),siblings=[set([36]),set([24]),set([20]),set([30]),set([38]),set([22]),set([26]),set([28]),set([32]),set([34])],socket=0) _numa_fit_instance_cell /usr/lib/python3.9/site-packages/nova/virt/hardware.py:929
nova-scheduler.log:2022-10-17 16:25:17.163 12 DEBUG nova.virt.hardware [req-6307d41d-02a6-40f7-88d8-c7fcc2714202 0bfbbc8dfa1548819b0af96786b1ad43 12a4c6a7df9947a39f54dedf04b22fea - default default] No specific pagesize requested for instance, selected pagesize: 4 _numa_fit_instance_cell /usr/lib/python3.9/site-packages/nova/virt/hardware.py:956
nova-scheduler.log:2022-10-17 16:25:17.163 12 DEBUG nova.virt.hardware [req-6307d41d-02a6-40f7-88d8-c7fcc2714202 0bfbbc8dfa1548819b0af96786b1ad43 12a4c6a7df9947a39f54dedf04b22fea - default default] Instance has requested pinned CPUs _numa_fit_instance_cell /usr/lib/python3.9/site-packages/nova/virt/hardware.py:1021
nova-scheduler.log:2022-10-17 16:25:17.164 12 DEBUG nova.virt.hardware [req-6307d41d-02a6-40f7-88d8-c7fcc2714202 0bfbbc8dfa1548819b0af96786b1ad43 12a4c6a7df9947a39f54dedf04b22fea - default default] Packing an instance onto a set of siblings:     host_cell_free_siblings: [{36}, {24}, set(), {30}, {38}, set(), {26}, {28}, {32}, {34}]    instance_cell: InstanceNUMACell(cpu_pinning_raw=None,cpu_policy='mixed',cpu_thread_policy=None,cpu_topology=<?>,cpuset=set([0]),cpuset_reserved=None,id=0,memory=256,pagesize=None,pcpuset=set([]))    host_cell_id: 0    threads_per_core: 1    num_cpu_reserved: 0 _pack_instance_onto_cores /usr/lib/python3.9/site-packages/nova/virt/hardware.py:658
nova-scheduler.log:2022-10-17 16:25:17.164 12 DEBUG nova.virt.hardware [req-6307d41d-02a6-40f7-88d8-c7fcc2714202 0bfbbc8dfa1548819b0af96786b1ad43 12a4c6a7df9947a39f54dedf04b22fea - default default] Built sibling_sets: defaultdict(<class 'list'>, {1: [{36}, {24}, {30}, {38}, {26}, {28}, {32}, {34}]}) _pack_instance_onto_cores /usr/lib/python3.9/site-packages/nova/virt/hardware.py:679
nova-scheduler.log:2022-10-17 16:25:17.164 12 DEBUG nova.virt.hardware [req-6307d41d-02a6-40f7-88d8-c7fcc2714202 0bfbbc8dfa1548819b0af96786b1ad43 12a4c6a7df9947a39f54dedf04b22fea - default default] User did not specify a thread policy. Using default for 1 cores _pack_instance_onto_cores /usr/lib/python3.9/site-packages/nova/virt/hardware.py:794
nova-scheduler.log:2022-10-17 16:25:17.164 12 INFO nova.virt.hardware [req-6307d41d-02a6-40f7-88d8-c7fcc2714202 0bfbbc8dfa1548819b0af96786b1ad43 12a4c6a7df9947a39f54dedf04b22fea - default default] Computed NUMA topology CPU pinning: usable pCPUs: [[36], [24], [30], [38], [26], [28], [32], [34]], vCPUs mapping: []
nova-scheduler.log:2022-10-17 16:25:17.165 12 INFO nova.virt.hardware [req-6307d41d-02a6-40f7-88d8-c7fcc2714202 0bfbbc8dfa1548819b0af96786b1ad43 12a4c6a7df9947a39f54dedf04b22fea - default default] Computed NUMA topology CPU pinning: usable pCPUs: [[36], [24], [30], [38], [26], [28], [32], [34]], vCPUs mapping: []
nova-scheduler.log:2022-10-17 16:25:17.165 12 DEBUG nova.virt.hardware [req-6307d41d-02a6-40f7-88d8-c7fcc2714202 0bfbbc8dfa1548819b0af96786b1ad43 12a4c6a7df9947a39f54dedf04b22fea - default default] Failed to map instance cell CPUs to host cell CPUs _numa_fit_instance_cell /usr/lib/python3.9/site-packages/nova/virt/hardware.py:1049
nova-scheduler.log:2022-10-17 16:25:17.165 12 DEBUG nova.scheduler.filters.numa_topology_filter [req-6307d41d-02a6-40f7-88d8-c7fcc2714202 0bfbbc8dfa1548819b0af96786b1ad43 12a4c6a7df9947a39f54dedf04b22fea - default default] [instance: fa4f0399-3507-42e8-a02d-5fd1269b3762] computesriov-1.localdomain, computesriov-1.localdomain fails NUMA topology requirements. The instance does not fit on this host. host_passes /usr/lib/python3.9/site-packages/nova/scheduler/filters/numa_topology_filter.py:106

** Affects: nova
     Importance: Undecided
     Assignee: Balazs Gibizer (balazs-gibizer)
         Status: New


** Tags: numa scheduler

** Changed in: nova
     Assignee: (unassigned) => Balazs Gibizer (balazs-gibizer)

** Tags added: numa scheduler

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1994526

Title:
  Asymmetric multi NUMA guest with cpu_policy: mixed fails to schedule

Status in OpenStack Compute (nova):
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1994526/+subscriptions


