
yahoo-eng-team team mailing list archive

[Bug 1910466] Re: NUMA instance spawn fails on get_best_cpu_topology when there is no 'threads' preference

 

** Also affects: nova/wallaby
   Importance: Undecided
       Status: New

** Changed in: nova/wallaby
       Status: New => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1910466

Title:
  NUMA instance spawn fails on get_best_cpu_topology when there is no
  'threads' preference

Status in OpenStack Compute (nova):
  Fix Released
Status in OpenStack Compute (nova) wallaby series:
  Fix Released

Bug description:
  Seen downstream in a customer environment: a NUMA instance fails
  driver.spawn during get_best_cpu_topology when (1) the flavor
  expresses no preference for CPU threads and (2) every possible
  topology has more than 1 CPU thread:

  2020-12-27 23:45:12.327 1 ERROR nova.compute.manager [req-321f4757-b777-4b6a-93d9-26352fe343a3 8529101f4f0c482ea4b82f9a955785cf 7a2328acc0a6451c8b23fb8184932506 - default default] [instance: 890325b1-4d73-4a9c-86c0-c4e811690c3f] Instance failed to spawn: IndexError: list index out of range
  2020-12-27 23:45:12.327 1 ERROR nova.compute.manager [instance: 890325b1-4d73-4a9c-86c0-c4e811690c3f] Traceback (most recent call last):
  2020-12-27 23:45:12.327 1 ERROR nova.compute.manager [instance: 890325b1-4d73-4a9c-86c0-c4e811690c3f]   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2273, in _build_resources
  2020-12-27 23:45:12.327 1 ERROR nova.compute.manager [instance: 890325b1-4d73-4a9c-86c0-c4e811690c3f]     yield resources
  2020-12-27 23:45:12.327 1 ERROR nova.compute.manager [instance: 890325b1-4d73-4a9c-86c0-c4e811690c3f]   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2053, in _build_and_run_instance
  2020-12-27 23:45:12.327 1 ERROR nova.compute.manager [instance: 890325b1-4d73-4a9c-86c0-c4e811690c3f]     block_device_info=block_device_info)
  2020-12-27 23:45:12.327 1 ERROR nova.compute.manager [instance: 890325b1-4d73-4a9c-86c0-c4e811690c3f]   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 3133, in spawn
  2020-12-27 23:45:12.327 1 ERROR nova.compute.manager [instance: 890325b1-4d73-4a9c-86c0-c4e811690c3f]     mdevs=mdevs)
  2020-12-27 23:45:12.327 1 ERROR nova.compute.manager [instance: 890325b1-4d73-4a9c-86c0-c4e811690c3f]   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 5447, in _get_guest_xml
  2020-12-27 23:45:12.327 1 ERROR nova.compute.manager [instance: 890325b1-4d73-4a9c-86c0-c4e811690c3f]     context, mdevs)
  2020-12-27 23:45:12.327 1 ERROR nova.compute.manager [instance: 890325b1-4d73-4a9c-86c0-c4e811690c3f]   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 5202, in _get_guest_config
  2020-12-27 23:45:12.327 1 ERROR nova.compute.manager [instance: 890325b1-4d73-4a9c-86c0-c4e811690c3f]     instance.numa_topology)
  2020-12-27 23:45:12.327 1 ERROR nova.compute.manager [instance: 890325b1-4d73-4a9c-86c0-c4e811690c3f]   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 3923, in _get_guest_cpu_config
  2020-12-27 23:45:12.327 1 ERROR nova.compute.manager [instance: 890325b1-4d73-4a9c-86c0-c4e811690c3f]     flavor, image_meta, numa_topology=instance_numa_topology)
  2020-12-27 23:45:12.327 1 ERROR nova.compute.manager [instance: 890325b1-4d73-4a9c-86c0-c4e811690c3f]   File "/usr/lib/python2.7/site-packages/nova/virt/hardware.py", line 625, in get_best_cpu_topology
  2020-12-27 23:45:12.327 1 ERROR nova.compute.manager [instance: 890325b1-4d73-4a9c-86c0-c4e811690c3f]     allow_threads, numa_topology)[0]
  2020-12-27 23:45:12.327 1 ERROR nova.compute.manager [instance: 890325b1-4d73-4a9c-86c0-c4e811690c3f] IndexError: list index out of range

  Note that the IndexError above is a separate issue: it is raised
  because get_best_cpu_topology indexes [0] into the empty list []
  that is returned after filtering for NUMA threads. This bug is about
  why that list is empty after the NUMA threads filtering.
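
  As a minimal sketch of that failure mode (not nova's code; the helper
  name first_or_error is hypothetical), indexing an empty list raises
  the same IndexError seen in the traceback, whereas an explicit check
  can report the real problem:

```python
def first_or_error(topologies):
    """Return the first topology, or raise a descriptive error if none remain."""
    if not topologies:
        # Hypothetical error type and message, chosen for illustration only.
        raise ValueError("no CPU topology satisfies the request")
    return topologies[0]


# Indexing an empty list directly reproduces the error from the traceback.
try:
    [][0]
except IndexError as exc:
    message = str(exc)

print(message)
```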

  In this example failure we have a request for vcpus=8 with limits
  cores=2, sockets=2, and threads=8:

  2020-12-27 23:45:12.322 1 DEBUG nova.virt.hardware [req-321f4757-b777-4b6a-93d9-26352fe343a3 8529101f4f0c482ea4b82f9a955785cf 7a2328acc0a6451c8b23fb8184932506 - default default] Getting desirable topologies for flavor Flavor(created_at=2020-11-23T08:41:17Z,deleted=False,deleted_at=None,description=None,disabled=False,ephemeral_gb=0,extra_specs={hw:cpu_max_cores='2',hw:cpu_max_sockets='2',hw:cpu_max_threads='8',hw:cpu_policy='dedicated',hw:mem_page_size='large',hw:numa_cpus.0='0,1,2,3',hw:numa_cpus.1='4,5,6,7',hw:numa_mem.0='16384',hw:numa_mem.1='16384',hw:numa_mempolicy='strict',hw:numa_nodes='2'},flavorid='2060ed99-654c-4309-88b8-9bceeb794ba3',id=176,is_public=True,memory_mb=32768,name='test',projects=<?>,root_gb=40,rxtx_factor=1.0,swap=0,updated_at=None,vcpu_weight=0,vcpus=8) and image_meta ImageMeta(checksum='157e26aac48c1a02c08d07f4f7a6d1b6',container_format='bare',created_at=2020-03-08T16:12:44Z,direct_url=<?>,disk_format='qcow2',id=7812d228-07c8-4a8f-9878-459a0093cc34,min_disk=0,min_ram=0,name='UAG_generic-180104',owner='7a2328acc0a6451c8b23fb8184932506',properties=ImageMetaProps,protected=<?>,size=769468416,status='active',tags=<?>,updated_at=2020-03-23T06:11:54Z,virtual_size=<?>,visibility=<?>), allow threads: True _get_desirable_cpu_topologies /usr/lib/python2.7/site-packages/nova/virt/hardware.py:567
  2020-12-27 23:45:12.323 1 DEBUG nova.virt.hardware [req-321f4757-b777-4b6a-93d9-26352fe343a3 8529101f4f0c482ea4b82f9a955785cf 7a2328acc0a6451c8b23fb8184932506 - default default] Flavor limits 2:2:8 _get_cpu_topology_constraints /usr/lib/python2.7/site-packages/nova/virt/hardware.py:313
  2020-12-27 23:45:12.323 1 DEBUG nova.virt.hardware [req-321f4757-b777-4b6a-93d9-26352fe343a3 8529101f4f0c482ea4b82f9a955785cf 7a2328acc0a6451c8b23fb8184932506 - default default] Image limits 2:2:8 _get_cpu_topology_constraints /usr/lib/python2.7/site-packages/nova/virt/hardware.py:324
  2020-12-27 23:45:12.324 1 DEBUG nova.virt.hardware [req-321f4757-b777-4b6a-93d9-26352fe343a3 8529101f4f0c482ea4b82f9a955785cf 7a2328acc0a6451c8b23fb8184932506 - default default] Flavor pref -1:-1:-1 _get_cpu_topology_constraints /usr/lib/python2.7/site-packages/nova/virt/hardware.py:347
  2020-12-27 23:45:12.324 1 DEBUG nova.virt.hardware [req-321f4757-b777-4b6a-93d9-26352fe343a3 8529101f4f0c482ea4b82f9a955785cf 7a2328acc0a6451c8b23fb8184932506 - default default] Image pref -1:-1:-1 _get_cpu_topology_constraints /usr/lib/python2.7/site-packages/nova/virt/hardware.py:366
  2020-12-27 23:45:12.325 1 DEBUG nova.virt.hardware [req-321f4757-b777-4b6a-93d9-26352fe343a3 8529101f4f0c482ea4b82f9a955785cf 7a2328acc0a6451c8b23fb8184932506 - default default] Chosen -1:-1:-1 limits 2:2:8 _get_cpu_topology_constraints /usr/lib/python2.7/site-packages/nova/virt/hardware.py:395
  2020-12-27 23:45:12.325 1 DEBUG nova.virt.hardware [req-321f4757-b777-4b6a-93d9-26352fe343a3 8529101f4f0c482ea4b82f9a955785cf 7a2328acc0a6451c8b23fb8184932506 - default default] Topology preferred VirtCPUTopology(cores=-1,sockets=-1,threads=-1), maximum VirtCPUTopology(cores=2,sockets=2,threads=8) _get_desirable_cpu_topologies /usr/lib/python2.7/site-packages/nova/virt/hardware.py:571
  2020-12-27 23:45:12.325 1 DEBUG nova.virt.hardware [req-321f4757-b777-4b6a-93d9-26352fe343a3 8529101f4f0c482ea4b82f9a955785cf 7a2328acc0a6451c8b23fb8184932506 - default default] Build topologies for 8 vcpu(s) 2:2:8 _get_possible_cpu_topologies /usr/lib/python2.7/site-packages/nova/virt/hardware.py:434
  2020-12-27 23:45:12.326 1 DEBUG nova.virt.hardware [req-321f4757-b777-4b6a-93d9-26352fe343a3 8529101f4f0c482ea4b82f9a955785cf 7a2328acc0a6451c8b23fb8184932506 - default default] Got 4 possible topologies _get_possible_cpu_topologies /usr/lib/python2.7/site-packages/nova/virt/hardware.py:461
  2020-12-27 23:45:12.326 1 DEBUG nova.virt.hardware [req-321f4757-b777-4b6a-93d9-26352fe343a3 8529101f4f0c482ea4b82f9a955785cf 7a2328acc0a6451c8b23fb8184932506 - default default] Possible topologies [VirtCPUTopology(cores=2,sockets=2,threads=2), VirtCPUTopology(cores=1,sockets=2,threads=4), VirtCPUTopology(cores=2,sockets=1,threads=4), VirtCPUTopology(cores=1,sockets=1,threads=8)] _get_desirable_cpu_topologies /usr/lib/python2.7/site-packages/nova/virt/hardware.py:576
  2020-12-27 23:45:12.326 1 DEBUG nova.virt.hardware [req-321f4757-b777-4b6a-93d9-26352fe343a3 8529101f4f0c482ea4b82f9a955785cf 7a2328acc0a6451c8b23fb8184932506 - default default] Filtering topologies best for 1 threads _get_desirable_cpu_topologies /usr/lib/python2.7/site-packages/nova/virt/hardware.py:594
  2020-12-27 23:45:12.327 1 DEBUG nova.virt.hardware [req-321f4757-b777-4b6a-93d9-26352fe343a3 8529101f4f0c482ea4b82f9a955785cf 7a2328acc0a6451c8b23fb8184932506 - default default] Remaining possible topologies [] _get_desirable_cpu_topologies /usr/lib/python2.7/site-packages/nova/virt/hardware.py:599
  2020-12-27 23:45:12.327 1 DEBUG nova.virt.hardware [req-321f4757-b777-4b6a-93d9-26352fe343a3 8529101f4f0c482ea4b82f9a955785cf 7a2328acc0a6451c8b23fb8184932506 - default default] Sorted desired topologies [] _get_desirable_cpu_topologies /usr/lib/python2.7/site-packages/nova/virt/hardware.py:602
  2020-12-27 23:45:12.327 1 ERROR nova.compute.manager [req-321f4757-b777-4b6a-93d9-26352fe343a3 8529101f4f0c482ea4b82f9a955785cf 7a2328acc0a6451c8b23fb8184932506 - default default] [instance: 890325b1-4d73-4a9c-86c0-c4e811690c3f] Instance failed to spawn: IndexError: list index out of range

  This is showing that the flavor and image specified no preference for
  sockets, cores, and threads (they are all -1) and 8 vcpus are
  required. The possible topologies that would satisfy the request for 8
  vcpus and stay within the limits of 2 max cores, 2 max sockets, and 8
  max threads are: [VirtCPUTopology(cores=2,sockets=2,threads=2),
  VirtCPUTopology(cores=1,sockets=2,threads=4),
  VirtCPUTopology(cores=2,sockets=1,threads=4),
  VirtCPUTopology(cores=1,sockets=1,threads=8)]. When there is no
  preference for the number of threads, the code will use a value of 1
  for the desired number of threads. It will then filter for the closest
  number of threads that does not exceed the desired number of threads.
  Because only 2, 4, or 8 threads could satisfy the request for 8 vcpus
  and all of these are greater than 1, all of the possible topologies
  were filtered out, leaving an empty list, and the request could not
  be fulfilled.
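
  The arithmetic above can be reproduced with a short, self-contained
  sketch. This is not nova's implementation: the Topology tuple and the
  functions possible_topologies and filter_for_threads are hypothetical
  stand-ins that only mirror the behaviour described in this report
  (enumerate every sockets x cores x threads split equal to the vcpu
  count, then keep the largest thread count not exceeding the desired
  value).

```python
from collections import namedtuple

# Hypothetical stand-in for nova's VirtCPUTopology object.
Topology = namedtuple("Topology", ["sockets", "cores", "threads"])


def possible_topologies(vcpus, max_sockets, max_cores, max_threads):
    """Enumerate every split where sockets * cores * threads == vcpus."""
    found = []
    for s in range(1, min(max_sockets, vcpus) + 1):
        for c in range(1, min(max_cores, vcpus) + 1):
            for t in range(1, min(max_threads, vcpus) + 1):
                if s * c * t == vcpus:
                    found.append(Topology(s, c, t))
    return found


def filter_for_threads(possible, want_threads):
    """Keep topologies whose thread count is the largest one <= want_threads."""
    fitting = [t.threads for t in possible if t.threads <= want_threads]
    if not fitting:
        return []  # nothing fits: this is the empty list seen in the logs
    best = max(fitting)
    return [t for t in possible if t.threads == best]


# Values from the failing request: 8 vcpus, limits 2 sockets, 2 cores, 8 threads.
possible = possible_topologies(8, max_sockets=2, max_cores=2, max_threads=8)
# No threads preference was given, so the desired thread count defaults to 1.
remaining = filter_for_threads(possible, want_threads=1)

print(len(possible), remaining)
```

  With these inputs the sketch finds four candidate topologies (thread
  counts 2, 4, 4, and 8) and the filter discards all of them, matching
  the "Got 4 possible topologies" and "Remaining possible topologies []"
  log lines.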

  Because the request expressed no preference for number of threads, one
  of the 4 possible cpu topologies should have been chosen instead of
  filtering all of the topologies out and returning an empty list. We
  will need to fix the logic around how requests without threads
  preference are handled.
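
  One possible shape for such handling is sketched below. It is only an
  illustration under stated assumptions, not the fix that was merged
  into nova: the function desirable_topologies and the constant
  NO_PREFERENCE are hypothetical, and the sketch simply skips thread
  filtering when the preference is unset (-1 in the logs above).

```python
from collections import namedtuple

# Hypothetical stand-in for nova's VirtCPUTopology object.
Topology = namedtuple("Topology", ["sockets", "cores", "threads"])

NO_PREFERENCE = -1  # the logs show -1 for an unset preference


def desirable_topologies(possible, preferred_threads):
    """Filter by threads only when a threads preference was expressed."""
    if preferred_threads == NO_PREFERENCE:
        # No preference: every possible topology remains a candidate.
        return list(possible)
    fitting = [t for t in possible if t.threads <= preferred_threads]
    if not fitting:
        return []
    best = max(t.threads for t in fitting)
    return [t for t in fitting if t.threads == best]


# The four possible topologies from the failing request.
candidates = [
    Topology(sockets=2, cores=2, threads=2),
    Topology(sockets=2, cores=1, threads=4),
    Topology(sockets=1, cores=2, threads=4),
    Topology(sockets=1, cores=1, threads=8),
]

print(desirable_topologies(candidates, NO_PREFERENCE))
```

  Under this assumption the request in this report would keep all four
  candidates, so a later step could still choose one of them.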

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1910466/+subscriptions


