
yahoo-eng-team team mailing list archive

[Bug 1805891] [NEW] pci numa policies are not followed

 

Public bug reported:

Description
===========
https://specs.openstack.org/openstack/nova-specs/specs/queens/implemented/share-pci-between-numa-nodes.html
introduced the concept of NUMA affinity policies for PCI passthrough devices.

Upon testing it was observed that the preferred policy is broken.

For context, there is a separate bug, https://bugs.launchpad.net/nova/+bug/1795920, tracking
the lack of support for neutron SR-IOV interfaces, so the scope of this bug is limited to
PCI NUMA policies for passthrough devices requested via a flavor alias.


Background
----------

By default in nova, PCI devices are NUMA-affinitized using the legacy policy,
but you can override this behavior via the alias. When set to preferred, nova
should fall back to no NUMA affinity between the guest and the PCI device
if a device on a local NUMA node is not available.

The policies are described below.

legacy

    This is the default value and it describes the current nova
behavior. Usually we have information about association of PCI devices
with NUMA nodes. However, some PCI devices do not provide such
information. The legacy value will mean that nova will boot instances
with a PCI device if either:

        The PCI device is associated with at least one of the NUMA nodes on which the instance will be booted
        There is no information about PCI-NUMA affinity available
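
    As an aside, whether a device exposes that affinity information at all can
be checked from sysfs (the address below is the one whitelisted later in this
report; a value of -1 means no NUMA affinity information is available):

    cat /sys/bus/pci/devices/0000:01:00.1/numa_node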


preferred

    This value will mean that nova-scheduler will choose a compute host
with minimal consideration for the NUMA affinity of PCI devices. nova-
compute will attempt a best effort selection of PCI devices based on
NUMA affinity, however, if this is not possible then nova-compute will
fall back to scheduling on a NUMA node that is not associated with the
PCI device.

    Note that even though the NUMATopologyFilter will not consider NUMA
affinity, the weigher proposed in the Reserve NUMA Nodes with PCI
Devices Attached spec [2] can be used to maximize the chance that a
chosen host will have NUMA-affinitized PCI devices.
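
For reference, the policy is selected per alias in the [pci] section of
nova.conf; omitting numa_policy gives the legacy behaviour. The values below
are the ones used in the reproduction later in this report:

    [pci]
    alias = { "vendor_id":"8086", "product_id":"10c9", "device_type":"type-PF", "name":"nic-pf", "numa_policy": "preferred" }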


Steps to reproduce
==================

The test case is relatively simple:

- deploy a single node devstack install on a host with 2 NUMA nodes.
- enable the PciPassthroughFilter and NUMATopologyFilter scheduler filters
  (see the config fragment after the steps).
- whitelist a PCI device attached to numa_node 0
  e.g. passthrough_whitelist = { "address": "0000:01:00.1" }
- adjust the vcpu_pin_set to only list the cpus on numa_node 1
  e.g. vcpu_pin_set=8-15
- create an alias in the [pci] section of nova.conf
  alias = { "vendor_id":"8086", "product_id":"10c9", "device_type":"type-PF", "name":"nic-pf", "numa_policy": "preferred"}
- restart the nova services
  sudo systemctl restart devstack@n-*

- update a flavor with the alias and a NUMA topology of 1 node
 openstack flavor set --property pci_passthrough:alias='nic-pf:1' 42
 openstack flavor set --property hw:numa_nodes=1 42


+----------------------------+-----------------------------------------------------+
| Field                      | Value                                               |
+----------------------------+-----------------------------------------------------+
| OS-FLV-DISABLED:disabled   | False                                               |
| OS-FLV-EXT-DATA:ephemeral  | 0                                                   |
| access_project_ids         | None                                                |
| disk                       | 0                                                   |
| id                         | 42                                                  |
| name                       | m1.nano                                             |
| os-flavor-access:is_public | True                                                |
| properties                 | hw:numa_nodes='1', pci_passthrough:alias='nic-pf:1' |
| ram                        | 64                                                  |
| rxtx_factor                | 1.0                                                 |
| swap                       |                                                     |
| vcpus                      | 1                                                   |
+----------------------------+-----------------------------------------------------+

Boot a VM with the flavor.
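
For completeness, the scheduler filter change referred to in the steps and an
example boot command (the image, network and server names are illustrative
and will differ per deployment):

  [filter_scheduler]
  # the deployment's existing filter list, with the two filters appended
  enabled_filters = ...,PciPassthroughFilter,NUMATopologyFilter

  openstack server create --flavor m1.nano --image cirros-0.3.5-x86_64-disk \
    --network private pci-numa-test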


Expected result
===============
The VM boots with CPUs and RAM from host NUMA node 1
and the PCI device from host NUMA node 0.
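
Had the fallback worked, the placement could be confirmed from the guest XML,
e.g. (the instance/domain name is illustrative):

  # guest memory should be bound to host NUMA node 1
  sudo virsh dumpxml instance-00000001 | grep -A 3 numatune
  # the passed-through 0000:01:00.1 device should still be present
  sudo virsh dumpxml instance-00000001 | grep -A 6 hostdev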

Actual result
=============

The resource tracker fails to claim the PCI device, as it cannot
create a guest with a virtual NUMA topology of 1 node using a PCI device from a remote NUMA node,
i.e. there is no fallback and the VM fails to boot because nova tries to enforce NUMA affinity.
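
For what it's worth, the failure is visible from the API as the instance
going to ERROR, and the failed claim is logged by nova-compute (the server
name is illustrative):

  openstack server show pci-numa-test -c status -c fault
  sudo journalctl -u devstack@n-cpu.service | grep -i pci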

Environment
===========
1. Exact version of OpenStack you are running.
   master, but I believe this is also broken on Queens and Rocky.

2. Which hypervisor did you use?
   libvirt/KVM.

3. Which storage type did you use?
   N/A; cinder LVM and the default libvirt image backend.

4. Which networking type did you use?
   N/A; Open vSwitch.

** Affects: nova
     Importance: High
     Assignee: sean mooney (sean-k-mooney)
         Status: Triaged


** Tags: numa pci

** Changed in: nova
   Importance: Undecided => High

** Changed in: nova
       Status: New => Triaged

** Changed in: nova
     Assignee: (unassigned) => sean mooney (sean-k-mooney)

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1805891
