
yahoo-eng-team team mailing list archive

[Bug 1461777] [NEW] NUMA cell overcommit can leave NUMA cells unused

 

Public bug reported:

NUMA cell overcommit can leave NUMA cells unused

When no NUMA configuration is defined for the guest (no flavor extra specs),
nova identifies the NUMA topology of the host and tries to match the cpu 
placement to a NUMA cell ("cpuset"). 

The cpuset is selected randomly (nova/virt/libvirt/driver.py):
    pin_cpuset = random.choice(viable_cells_cpus)

However, this can lead to NUMA cells not being used.
This is particularly noticeable when the flavor has the same number of vcpus
as a host NUMA cell has CPUs and the host CPUs are not overcommitted (cpu_allocation_ratio = 1).

###
Particular use case:

Compute nodes with the NUMA topology:
<VirtNUMAHostTopology: {'cells': [{'mem': {'total': 12279, 'used': 0}, 'cpu_usage': 0, 'cpus': '0,1,2,3,8,9,10,11', 'id': 0}, {'mem': {'total': 12288, 'used': 0}, 'cpu_usage': 0, 'cpus': '4,5,6,7,12,13,14,15', 'id': 1}]}>

No CPU overcommit: cpu_allocation_ratio = 1
Boot instances using a flavor with 8 vcpus. 
(No NUMA topology defined for the guest in the flavor)

In this case the host can hold 2 instances (no CPU overcommit).
Because the choice is random, both instances can be assigned the same cpuset out of the 2 options:
<vcpu placement='static' cpuset='4-7,12-15'>8</vcpu>
<vcpu placement='static' cpuset='0-3,8-11'>8</vcpu>

As a consequence, half of the host CPUs are not used.
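
To put a number on this case, here is a minimal sketch that enumerates every way the instances can be placed. It is not nova code: the function name and the HOST_CELLS table are made up for illustration, and it assumes each instance picks a cell uniformly and independently, with both cells still viable at every boot (which is what the behaviour described above suggests).

```python
import itertools
from fractions import Fraction

# Cells from the host topology above, written as libvirt cpuset strings.
# (Illustrative table only; nova does not expose this structure.)
HOST_CELLS = {0: "0-3,8-11", 1: "4-7,12-15"}


def unused_cell_probability(cell_ids, n_instances):
    """Exact probability that at least one cell receives no instance.

    Assumes every instance picks a cell uniformly at random and
    independently of earlier placements.
    """
    cell_ids = list(cell_ids)
    outcomes = list(itertools.product(cell_ids, repeat=n_instances))
    # An outcome leaves a cell unused when it does not cover every cell id.
    leaving_unused = [o for o in outcomes if set(o) != set(cell_ids)]
    return Fraction(len(leaving_unused), len(outcomes))


if __name__ == "__main__":
    # Two 8-vcpu instances on the two-cell host described above.
    print(unused_cell_probability(HOST_CELLS, 2))  # 1/2
```

Under those assumptions, half of the fully loaded hosts of this shape would end up with one NUMA cell idle.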


###
How to reproduce:

Using: nova 2014.2.2
(not tested on trunk; however, the code path looks similar)

1. Set cpu_allocation_ratio = 1.
2. Identify the NUMA topology of the compute node.
3. Using a flavor whose number of vcpus matches the number of CPUs in one NUMA cell
of the compute node, boot instances until the compute node is full.
4. Check the cpu placement ("cpuset") used by each instance.

Notes: 
- At this point instances can use the same "cpuset", leaving NUMA cells unused.
- The selection of the cpuset is random, so several attempts may be needed.
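
Step 4 can be scripted. The sketch below is only an illustration: it assumes the libvirt domain XML of each instance is already available as text (for example from "virsh dumpxml"), and the helper names parse_cpuset and shared_cpusets are hypothetical. It handles only the plain "a-b,c" cpuset form shown above, not libvirt's "^" exclusion syntax.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def parse_cpuset(spec):
    """Expand a cpuset string such as '0-3,8-11' into a frozenset of CPU ids."""
    cpus = set()
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            low, high = part.split("-")
            cpus.update(range(int(low), int(high) + 1))
        elif part:
            cpus.add(int(part))
    return frozenset(cpus)


def shared_cpusets(domain_xml_by_instance):
    """Return {cpuset: [instance names]} for cpusets used by more than one instance."""
    groups = defaultdict(list)
    for name, xml_text in domain_xml_by_instance.items():
        # <vcpu> is a direct child of the <domain> root element.
        vcpu = ET.fromstring(xml_text).find("vcpu")
        if vcpu is not None and vcpu.get("cpuset"):
            groups[parse_cpuset(vcpu.get("cpuset"))].append(name)
    return {cpus: names for cpus, names in groups.items() if len(names) > 1}


if __name__ == "__main__":
    # Trimmed-down domain XML for two instances that drew the same cell.
    xmls = {
        "instance-a": "<domain><vcpu placement='static' cpuset='4-7,12-15'>8</vcpu></domain>",
        "instance-b": "<domain><vcpu placement='static' cpuset='4-7,12-15'>8</vcpu></domain>",
    }
    for cpus, names in shared_cpusets(xmls).items():
        print(sorted(cpus), names)
```

If the result is non-empty on a full host with cpu_allocation_ratio = 1, at least one NUMA cell has been left unused.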

** Affects: nova
     Importance: Undecided
         Status: New


** Tags: libvirt

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1461777

Title:
  NUMA cell overcommit can leave NUMA cells unused

Status in OpenStack Compute (Nova):
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1461777/+subscriptions

