yahoo-eng-team team mailing list archive

[Bug 1994526] Re: Asymmetric multi NUMA guest with cpu_policy: mixed fails to schedule

 

Reviewed:  https://review.opendev.org/c/openstack/nova/+/862687
Committed: https://opendev.org/openstack/nova/commit/cffe3971ce585a1ddc374a3ed067347857338831
Submitter: "Zuul (22348)"
Branch:    master

commit cffe3971ce585a1ddc374a3ed067347857338831
Author: Balazs Gibizer <gibi@xxxxxxxxxx>
Date:   Wed Oct 26 13:28:47 2022 +0200

    Handle zero pinned CPU in a cell with mixed policy
    
    When cpu_policy is mixed the scheduler tries to find a valid CPU pinning
    for each instance NUMA cell. However if there is an instance NUMA cell
    that does not request any pinned CPUs then such logic will calculate
    empty pinning information for that cell. Then the scheduler logic
    wrongly assumes that an empty pinning result means there was no valid
    pinning. However, there is a difference between a None result, which means
    no valid pinning was found, and an empty result [], which means there was
    nothing to pin.
    
    This patch makes sure that pinning == None is differentiated from
    pinning == [].
    
    Closes-Bug: #1994526
    Change-Id: I5a35a45abfcfbbb858a94927853777f112e73e5b
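
The distinction the commit message draws can be illustrated with a minimal sketch. This is not nova's code: compute_pinning, fits_buggy and fits_fixed are hypothetical names, and the pinning calculation is reduced to pairing vCPUs with free pCPUs. It only shows why a truthiness check rejects a cell that has nothing to pin, while an explicit None check does not.

```python
from typing import List, Optional, Set, Tuple

# A pinning is a list of (vCPU, pCPU) pairs, or None when no valid pinning exists.
Pinning = Optional[List[Tuple[int, int]]]


def compute_pinning(vcpus_to_pin: Set[int], free_pcpus: Set[int]) -> Pinning:
    """Hypothetical stand-in for the real pinning calculation.

    Returns None when there are not enough free pCPUs, otherwise a list of
    (vCPU, pCPU) pairs. The list is empty when no vCPU needs pinning.
    """
    vcpus = sorted(vcpus_to_pin)
    pcpus = sorted(free_pcpus)
    if len(vcpus) > len(pcpus):
        return None
    return list(zip(vcpus, pcpus))


def fits_buggy(pinning: Pinning) -> bool:
    # Truthiness check: treats [] the same as None, so a cell with
    # nothing to pin is reported as a failure.
    return bool(pinning)


def fits_fixed(pinning: Pinning) -> bool:
    # Explicit check: only None means that no valid pinning was found.
    return pinning is not None


# Instance NUMA cell with no pinned vCPUs (like cell 0 in this bug).
nothing_to_pin = compute_pinning(set(), {24, 26, 28})
# Instance NUMA cell asking for more pinned vCPUs than the host cell has free.
no_valid_pinning = compute_pinning({1, 2}, {25})

print(nothing_to_pin, fits_buggy(nothing_to_pin), fits_fixed(nothing_to_pin))
print(no_valid_pinning, fits_buggy(no_valid_pinning), fits_fixed(no_valid_pinning))
```

With the truthiness check both cases are rejected. With the explicit None check only the second one is.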


** Changed in: nova
       Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1994526

Title:
  Asymmetric multi NUMA guest with cpu_policy: mixed fails to schedule

Status in OpenStack Compute (nova):
  Fix Released

Bug description:
  Based on https://bugzilla.redhat.com/show_bug.cgi?id=2135439

  Compute Host NUMA Topology:

    NUMA node0 CPU(s):     0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38
    NUMA node1 CPU(s):     1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37,39

  Hosts Dedicated/Shared Configurations:
  [tripleo-admin@computesriov-1 ~]$ sudo crudini --get /var/lib/config-data/puppet-generated/nova_libvirt/etc/nova/nova.conf compute cpu_dedicated_set
  24-39
  [tripleo-admin@computesriov-1 ~]$ sudo crudini --get /var/lib/config-data/puppet-generated/nova_libvirt/etc/nova/nova.conf compute cpu_shared_set
  20-23

  [tripleo-admin@computesriov-0 ~]$ sudo crudini --get /var/lib/config-data/puppet-generated/nova_libvirt/etc/nova/nova.conf compute cpu_dedicated_set
  4-19
  [tripleo-admin@computesriov-0 ~]$ sudo crudini --get /var/lib/config-data/puppet-generated/nova_libvirt/etc/nova/nova.conf compute cpu_shared_set
  0,1,2,3
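
As a cross-check, the per-NUMA-node dedicated and shared CPUs of computesriov-1 can be derived from the values above. This is a rough sketch: parse_cpu_spec is a hypothetical helper that handles only comma-separated IDs and simple ranges, not the full syntax nova accepts. The result for node0 agrees with the pcpuset and cpuset of the host cell reported in the scheduler log below.

```python
from typing import Set


def parse_cpu_spec(spec: str) -> Set[int]:
    """Hypothetical helper: parse specs such as "24-39" or "0,1,2,3"."""
    cpus: Set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            first, last = part.split("-")
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return cpus


# Host NUMA topology from the report: even CPUs on node0, odd CPUs on node1.
node0 = set(range(0, 40, 2))
node1 = set(range(1, 40, 2))

# computesriov-1 settings from the report.
dedicated = parse_cpu_spec("24-39")
shared = parse_cpu_spec("20-23")

node0_dedicated = sorted(dedicated & node0)
node0_shared = sorted(shared & node0)
node1_dedicated = sorted(dedicated & node1)
node1_shared = sorted(shared & node1)

print("node0 dedicated:", node0_dedicated, "shared:", node0_shared)
print("node1 dedicated:", node1_dedicated, "shared:", node1_shared)
```

Each host NUMA node therefore offers 8 dedicated and 2 shared CPUs, so host capacity is not what causes the scheduling failure.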

  ######################################################################
  # Failed deployment (3 vCPUs, asymmetric NUMA, mixed CPU policy)     #
  ######################################################################

  (overcloud) [stack@undercloud-0 ~]$ openstack flavor show tempest-MixedCPUPolicyTestMultiNuma-flavor-1583852539
  /usr/lib/python3.9/site-packages/openstack/config/cloud_region.py:452: UserWarning: You have a configured API_VERSION with 'latest' in it. In the context of openstacksdk this doesn't make any sense.
    warnings.warn(
  +----------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------+
  | Field                      | Value                                                                                                                                                    |
  +----------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------+
  | OS-FLV-DISABLED:disabled   | False                                                                                                                                                    |
  | OS-FLV-EXT-DATA:ephemeral  | 0                                                                                                                                                        |
  | access_project_ids         | None                                                                                                                                                     |
  | description                | None                                                                                                                                                     |
  | disk                       | 1                                                                                                                                                        |
  | id                         | 1495385529                                                                                                                                               |
  | name                       | tempest-MixedCPUPolicyTestMultiNuma-flavor-1583852539                                                                                                    |
  | os-flavor-access:is_public | True                                                                                                                                                     |
  | properties                 | hw:cpu_dedicated_mask='^0', hw:cpu_policy='mixed', hw:numa_cpus.0='0', hw:numa_cpus.1='1,2', hw:numa_mem.0='256', hw:numa_mem.1='768', hw:numa_nodes='2' |
  | ram                        | 1024                                                                                                                                                     |
  | rxtx_factor                | 1.0                                                                                                                                                      |
  | swap                       |                                                                                                                                                          |
  | vcpus                      | 3                                                                                                                                                        |
  +----------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------+
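
The flavor properties above can be broken down per guest NUMA cell. The following is a simplified sketch: split_vcpus is a hypothetical helper that accepts only comma-separated vCPU IDs in hw:cpu_dedicated_mask, optionally prefixed with '^' to mean "every vCPU except these is dedicated". It shows that guest cell 0 ends up with no dedicated vCPUs, which matches cpuset=set([0]) and pcpuset=set([]) in the log below.

```python
from typing import Dict, Set, Tuple


def split_vcpus(total_vcpus: int, dedicated_mask: str) -> Tuple[Set[int], Set[int]]:
    """Hypothetical helper: return (dedicated, shared) vCPU sets.

    Simplification: the mask is a comma-separated list of vCPU IDs. A
    leading '^' means the listed vCPUs are shared and the rest dedicated.
    """
    all_vcpus = set(range(total_vcpus))
    if dedicated_mask.startswith("^"):
        shared = {int(i) for i in dedicated_mask[1:].split(",")}
        dedicated = all_vcpus - shared
    else:
        dedicated = {int(i) for i in dedicated_mask.split(",")}
        shared = all_vcpus - dedicated
    return dedicated, shared


# Values taken from the flavor shown above.
dedicated, shared = split_vcpus(3, "^0")
numa_cpus: Dict[int, Set[int]] = {0: {0}, 1: {1, 2}}  # hw:numa_cpus.0 / hw:numa_cpus.1

per_cell = {
    cell: {"dedicated": cpus & dedicated, "shared": cpus & shared}
    for cell, cpus in numa_cpus.items()
}

for cell, split in per_cell.items():
    print(f"guest cell {cell}: dedicated={sorted(split['dedicated'])} "
          f"shared={sorted(split['shared'])}")
```

Guest cell 0 holds only the shared vCPU 0, while guest cell 1 holds the two dedicated vCPUs 1 and 2. That asymmetry is what triggers the empty pinning result.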

  nova-scheduler.log:2022-10-17 16:25:17.163 12 DEBUG nova.virt.hardware [req-6307d41d-02a6-40f7-88d8-c7fcc2714202 0bfbbc8dfa1548819b0af96786b1ad43 12a4c6a7df9947a39f54dedf04b22fea - default default] Attempting to fit instance cell InstanceNUMACell(cpu_pinning_raw=None,cpu_policy='mixed',cpu_thread_policy=None,cpu_topology=<?>,cpuset=set([0]),cpuset_reserved=None,id=0,memory=256,pagesize=None,pcpuset=set([])) on host_cell NUMACell(cpu_usage=0,cpuset=set([20,22]),id=0,memory=128210,memory_usage=0,mempages=[NUMAPagesTopology,NUMAPagesTopology,NUMAPagesTopology],network_metadata=NetworkMetadata,pcpuset=set([32,34,36,38,24,26,28,30]),pinned_cpus=set([]),siblings=[set([36]),set([24]),set([20]),set([30]),set([38]),set([22]),set([26]),set([28]),set([32]),set([34])],socket=0) _numa_fit_instance_cell /usr/lib/python3.9/site-packages/nova/virt/hardware.py:929
  nova-scheduler.log:2022-10-17 16:25:17.163 12 DEBUG nova.virt.hardware [req-6307d41d-02a6-40f7-88d8-c7fcc2714202 0bfbbc8dfa1548819b0af96786b1ad43 12a4c6a7df9947a39f54dedf04b22fea - default default] No specific pagesize requested for instance, selected pagesize: 4 _numa_fit_instance_cell /usr/lib/python3.9/site-packages/nova/virt/hardware.py:956
  nova-scheduler.log:2022-10-17 16:25:17.163 12 DEBUG nova.virt.hardware [req-6307d41d-02a6-40f7-88d8-c7fcc2714202 0bfbbc8dfa1548819b0af96786b1ad43 12a4c6a7df9947a39f54dedf04b22fea - default default] Instance has requested pinned CPUs _numa_fit_instance_cell /usr/lib/python3.9/site-packages/nova/virt/hardware.py:1021
  nova-scheduler.log:2022-10-17 16:25:17.164 12 DEBUG nova.virt.hardware [req-6307d41d-02a6-40f7-88d8-c7fcc2714202 0bfbbc8dfa1548819b0af96786b1ad43 12a4c6a7df9947a39f54dedf04b22fea - default default] Packing an instance onto a set of siblings:     host_cell_free_siblings: [{36}, {24}, set(), {30}, {38}, set(), {26}, {28}, {32}, {34}]    instance_cell: InstanceNUMACell(cpu_pinning_raw=None,cpu_policy='mixed',cpu_thread_policy=None,cpu_topology=<?>,cpuset=set([0]),cpuset_reserved=None,id=0,memory=256,pagesize=None,pcpuset=set([]))    host_cell_id: 0    threads_per_core: 1    num_cpu_reserved: 0 _pack_instance_onto_cores /usr/lib/python3.9/site-packages/nova/virt/hardware.py:658
  nova-scheduler.log:2022-10-17 16:25:17.164 12 DEBUG nova.virt.hardware [req-6307d41d-02a6-40f7-88d8-c7fcc2714202 0bfbbc8dfa1548819b0af96786b1ad43 12a4c6a7df9947a39f54dedf04b22fea - default default] Built sibling_sets: defaultdict(<class 'list'>, {1: [{36}, {24}, {30}, {38}, {26}, {28}, {32}, {34}]}) _pack_instance_onto_cores /usr/lib/python3.9/site-packages/nova/virt/hardware.py:679
  nova-scheduler.log:2022-10-17 16:25:17.164 12 DEBUG nova.virt.hardware [req-6307d41d-02a6-40f7-88d8-c7fcc2714202 0bfbbc8dfa1548819b0af96786b1ad43 12a4c6a7df9947a39f54dedf04b22fea - default default] User did not specify a thread policy. Using default for 1 cores _pack_instance_onto_cores /usr/lib/python3.9/site-packages/nova/virt/hardware.py:794
  nova-scheduler.log:2022-10-17 16:25:17.164 12 INFO nova.virt.hardware [req-6307d41d-02a6-40f7-88d8-c7fcc2714202 0bfbbc8dfa1548819b0af96786b1ad43 12a4c6a7df9947a39f54dedf04b22fea - default default] Computed NUMA topology CPU pinning: usable pCPUs: [[36], [24], [30], [38], [26], [28], [32], [34]], vCPUs mapping: []
  nova-scheduler.log:2022-10-17 16:25:17.165 12 INFO nova.virt.hardware [req-6307d41d-02a6-40f7-88d8-c7fcc2714202 0bfbbc8dfa1548819b0af96786b1ad43 12a4c6a7df9947a39f54dedf04b22fea - default default] Computed NUMA topology CPU pinning: usable pCPUs: [[36], [24], [30], [38], [26], [28], [32], [34]], vCPUs mapping: []
  nova-scheduler.log:2022-10-17 16:25:17.165 12 DEBUG nova.virt.hardware [req-6307d41d-02a6-40f7-88d8-c7fcc2714202 0bfbbc8dfa1548819b0af96786b1ad43 12a4c6a7df9947a39f54dedf04b22fea - default default] Failed to map instance cell CPUs to host cell CPUs _numa_fit_instance_cell /usr/lib/python3.9/site-packages/nova/virt/hardware.py:1049
  nova-scheduler.log:2022-10-17 16:25:17.165 12 DEBUG nova.scheduler.filters.numa_topology_filter [req-6307d41d-02a6-40f7-88d8-c7fcc2714202 0bfbbc8dfa1548819b0af96786b1ad43 12a4c6a7df9947a39f54dedf04b22fea - default default] [instance: fa4f0399-3507-42e8-a02d-5fd1269b3762] computesriov-1.localdomain, computesriov-1.localdomain fails NUMA topology requirements. The instance does not fit on this host. host_passes /usr/lib/python3.9/site-packages/nova/scheduler/filters/numa_topology_filter.py:106

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1994526/+subscriptions


