yahoo-eng-team team mailing list archive

[Bug 1983570] [NEW] cannot schedule ovs sriov offload port to tunneled segment

Public bug reported:

We observed a scheduling failure when using ovs sriov offload (https://docs.openstack.org/neutron/latest/admin/config-ovs-offload.html) in combination with multisegment networks. The problem seems to affect the case where the port should be bound to a tunneled network segment (a segment that does not have a physnet).

I have read that the nova scheduler works the same way for pci sriov
passthrough, so I believe the same bug affects pci sriov passthrough
too, though I did not test that.

Due to the special hardware this environment needs, I could not
reproduce this in devstack. But I hope we have collected enough
information to show the error regardless. We believe we have also
identified the relevant lines of code.

The overall setup includes l2gw, which connects the segments in the
multisegment network. But I will ignore that here, since l2gw cannot be
part of the root cause. Neutron was configured with
mechanism_drivers=sriovnicswitch,opendaylight_v2. However, since the
error happens before we bind the port, I believe the mechanism driver is
irrelevant as long as it allows the creation of ports with "--vnic-type
direct --binding-profile '{"capabilities": ["switchdev"]}'". For the
sake of simplicity I will call these "ovs sriov offload ports".

As I understand the problem:

1) ovs sriov offload port on a single segment neutron network, the segment is vxlan: works
2) normal port (--vnic-type normal) on non-offload-capable ovs, on a multisegment neutron network with one vlan and one vxlan segment, the port should be bound to the vxlan segment: works
3) ovs sriov offload port on a multisegment neutron network with one vlan and one vxlan segment, the port should be bound to the vxlan segment: does not work

To reproduce:
* create a multisegment network with one vlan and one vxlan segment
* create a port on that network with "--vnic-type direct --binding-profile '{"capabilities": ["switchdev"]}' --disable-port-security --no-security-group".
* boot a vm with that port

On the compute host on which we expect the scheduling and boot to succeed we have configuration like:
[pci]
passthrough_whitelist = [{"devname": "data2", "physical_network": null}, {"devname": "data3", "physical_network": null}]

According to https://docs.openstack.org/nova/latest/admin/pci-passthrough.html
this marks these devices on the host as passthrough (and ovs offload)
capable for tunneled segments.
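
As a purely illustrative sketch (not nova's code), this is how we read that whitelist entry: the JSON null becomes Python None, and a None physnet selects devices that serve overlay/tunneled traffic only:

import json

# The passthrough_whitelist value from nova.conf above; JSON null
# parses to Python None.
raw = '[{"devname": "data2", "physical_network": null}, {"devname": "data3", "physical_network": null}]'
for entry in json.loads(raw):
    # Our reading of the docs: physical_network=None means "this device
    # carries overlay (tunneled) networks", not "any physical network".
    assert entry["physical_network"] is None
    print(entry["devname"], "-> usable for tunneled segments only")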

The vm boot fails with:

$ openstack server show c3_ms_1
...
| fault                               | {'code': 500, 'created': '2022-07-16T08:12:31Z', 'message': 'Insufficient compute resources: Requested instance NUMA topology together with requested PCI devices cannot fit the given host NUMA topology; Claim pci failed.', 'details': 'Traceback (most recent call last):\n  File "/usr/lib/python3.6/site-packages/nova/compute/manager.py", line 2418, in _build_and_run_instance\n    limits):\n  File "/usr/lib/python3.6/site-packages/oslo_concurrency/lockutils.py", line 360, in inner\n    return f(*args, **kwargs)\n  File "/usr/lib/python3.6/site-packages/nova/compute/resource_tracker.py", line 172, in instance_claim\n    pci_requests, limits=limits)\n  File "/usr/lib/python3.6/site-packages/nova/compute/claims.py", line 72, in __init__\n    self._claim_test(compute_node, limits)\n  File "/usr/lib/python3.6/site-packages/nova/compute/claims.py", line 114, in _claim_test\n    "; ".join(reasons))\nnova.exception.ComputeResourcesUnavailable: Insufficient compute resources: Requested instance NUMA topology together with requested PCI devices cannot fit the given host NUMA topology; Claim pci failed.\n\nDuring handling of the above exception, another exception occurred:\n\nTraceback (most recent call last):\n  File "/usr/lib/python3.6/site-packages/nova/compute/manager.py", line 2271, in _do_build_and_run_instance\n    filter_properties, request_spec, accel_uuids)\n  File "/usr/lib/python3.6/site-packages/nova/compute/manager.py", line 2469, in _build_and_run_instance\n    instance_uuid=instance.uuid, reason=e.format_message())\nnova.exception.RescheduledException: Build of instance 09f3f8bb-b4c0-4395-8167-c10609d32d08 was re-scheduled: Insufficient compute resources: Requested instance NUMA topology together with requested PCI devices cannot fit the given host NUMA topology; Claim pci failed.\n'} |
...

In the scheduler logs we see that the scheduler uses a spec with a
physnet. But the pci passthrough capability is on a device without a
physnet.

controlhost3:/home/ceeinfra # grep DC259-CEE3- /var/log/nova/nova-scheduler.log
<180>2022-07-16T10:12:29.680009+02:00 controlhost3.dc259cee3.cloud.k2.ericsson.se nova-scheduler[67299]: 2022-07-16 10:12:29.679 76 WARNING nova.scheduler.host_manager [req-4dd7c37e-eb18-48da-9914-44a6a2a18b1d fcd3b2713191485d95befe1941f20e20 cf7024f0f2bd46a8b17fd42055a20323 - default default] Selected host: compute3.dc259cee3.cloud.k2.ericsson.se failed to consume from instance. Error: PCI device request [InstancePCIRequest(alias_name=None,count=1,is_new=<?>,numa_policy=None,request_id=a5644948-3b13-4cca-a98a-7780ee4d2157,requester_id='e79a0e66-debd-42e8-a46d-9cdc29a7c960',spec=[{physical_network='DC259-CEE3-DCGW-NET'}])] failed: nova.exception.PciDeviceRequestFailed: PCI device request [InstancePCIRequest(alias_name=None,count=1,is_new=<?>,numa_policy=None,request_id=a5644948-3b13-4cca-a98a-7780ee4d2157,requester_id='e79a0e66-debd-42e8-a46d-9cdc29a7c960',spec=[{physical_network='DC259-CEE3-DCGW-NET'}])] failed
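
To illustrate why that request can never succeed on this host, here is a minimal sketch of spec-to-device matching (our simplification, not nova's pci manager code):

# Device pools as the whitelist above defines them: no physnet.
pools = [
    {"devname": "data2", "physical_network": None},
    {"devname": "data3", "physical_network": None},
]

# The spec taken from the InstancePCIRequest in the scheduler log.
request_spec = [{"physical_network": "DC259-CEE3-DCGW-NET"}]

def pool_matches(pool, spec):
    # Every tag in the spec must match the pool exactly; None on the
    # pool side never equals the vlan segment's physnet string.
    return all(pool.get(tag) == value for tag, value in spec.items())

candidates = [pool for pool in pools
              if any(pool_matches(pool, spec) for spec in request_spec)]
print(candidates)  # [] -> the claim fails with PciDeviceRequestFailed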

We observed the bug originally on stable/victoria and found these source code lines:
https://opendev.org/openstack/nova/src/commit/8097c2b2153ff952a266395d4e351fc39f914c6b/nova/network/neutron.py#L2128-L2135

Here, for vnic_type=direct ports, we unconditionally add a physnet to
the spec.
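
In spirit (a simplified paraphrase on our part, not the literal nova source at the link), the behavior is:

def build_pci_request_spec(vnic_type, physnet):
    # Simplified paraphrase of the linked lines; names are ours, not
    # nova's. The spec for a vnic_type=direct port always carries the
    # network's physnet.
    spec = {"dev_type": "type-VF"}  # illustrative base spec
    # The problematic step: nothing checks whether the port could
    # instead be bound to a tunneled segment with no physnet.
    spec["physical_network"] = physnet
    return [spec]

# For our multisegment network the physnet resolves to the vlan
# segment's value, reproducing the failing spec from the log:
print(build_pci_request_spec("direct", "DC259-CEE3-DCGW-NET"))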

From victoria to master the only recent change in that piece is this:
https://opendev.org/openstack/nova/commit/0620678344d0f032a33e952d4d0fa653741f09e7 ("Add support for VNIC_REMOTE_MANAGED")

That change seems irrelevant to this bug, so I believe the bug can be
reproduced on master too.

Since I have not reproduced this bug myself, I include below my
colleague Angelo Nappo's email to me, containing the exact commands and
configs. However, I hope the summary above already provides all the
relevant information and eliminates all downstream-specific details:

The use case in the legacy NFVI solution

The system is based on VXLAN+SDN. If the VM has to reach any host outside the data center:
    an MS network is created
    an l2gw connection is created, so the segments are “joined together” in the switch fabric that does the vlan to vxlan transformation.

For example:

openstack network create --provider-network-type vxlan --provider-segment 2638 esohtom_ms_net_vlan_638
openstack network segment create --network-type vxlan --network esohtom_ms_net_vlan_638 vxlan_segment
openstack network segment create --physical-network DC259-CEE3-DCGW-NET --segment 638 --network-type vlan --network esohtom_ms_net_vlan_638 vlan_dcgw_segment

ceeinfra@controlhost1:~> openstack network segment list

+--------------------------------------+-------------------+--------------------------------------+--------------+---------+
| ID                                   | Name              | Network                              | Network Type | Segment |
+--------------------------------------+-------------------+--------------------------------------+--------------+---------+
| 1adda535-7c37-4db3-a153-976b93649957 | vxlan_segment     | 10084fdd-cb84-4ed8-af28-a5b3a79c5bd5 | vxlan        |    2700 |
| 29b3505c-d70b-4c24-b1b1-b0e27e64f120 | None              | cf84212f-e66b-4455-8415-d92492639f16 | flat         |    None |
| 506c76db-4ab6-4453-9d54-63c7e89d4a34 | vlan_dcgw_segment | 10084fdd-cb84-4ed8-af28-a5b3a79c5bd5 | vlan         |     638 |
| 578d3f85-d745-4e40-8e02-69d400c5e6c5 | None              | a5736c68-e4a4-42b2-85ae-d5134e866dae | vxlan        |    2099 |
| 78b21ff9-29a8-4b5d-84dc-fcf00de8f697 | None              | 04e890ab-01e3-442d-8e13-8d32f25f3ff3 | vxlan        |    2374 |
| 7b112fcd-b3d3-409c-8b83-184f4a253592 | None              | 10084fdd-cb84-4ed8-af28-a5b3a79c5bd5 | vxlan        |    2638 |
| 8ca5d660-8eba-4024-8bc8-023f2cab2e9c | None              | b49ee119-cbc6-45f7-bc07-1b11dd602771 | vxlan        |    2441 |
| 977147b3-3b26-4d7e-badc-721749d0cf65 | None              | d6d1f239-cfce-49d8-9cb4-a1e1305a5981 | vxlan        |    2563 |
| 9b6369fe-dd7d-4d4b-b7eb-f0d3b263562e | None              | b922cbe8-303d-4708-ad1c-59c1ac48de58 | vxlan        |    2057 |
| cd6f295e-cf41-4c3f-911a-5490ff790313 | None              | ed9b77ab-b225-4f63-ae7c-2e992c7f8335 | vxlan        |    2795 |
+--------------------------------------+-------------------+--------------------------------------+--------------+---------+

Note: The second vxlan segment (2700) was created to see if it makes
any difference, but it is definitely neither needed nor wanted.

openstack subnet create --network esohtom_ms_net_vlan_638 --dhcp --gateway 172.60.38.1 --allocation-pool start=172.60.38.10,end=172.60.38.100 --subnet-range 172.60.38.1/24 esohtom_vn_subnet
openstack port create --vnic-type normal --disable-port-security --no-security-group --network esohtom_ms_net_vlan_638 norm_MS_1

neutron l2-gateway-connection-create --default-segmentation-id 638 L2GW_PHY1_LeafCluster001 esohtom_ms_net_vlan_638
neutron l2-gateway-connection-create --default-segmentation-id 638 L2GW_PHY0_LeafCluster001 esohtom_ms_net_vlan_638
neutron l2-gateway-connection-create --default-segmentation-id 638 L2GW_BORDER_LeafCluster001 esohtom_ms_net_vlan_638

ceeinfra@controlhost1:~> openstack server create --image BAT-image --flavor flavor_1 --port norm_MS_1 --availability-zone=nova:compute1.dc259cee3.cloud.k2.ericsson.se c1_ms_1
+-------------------------------------+--------------------------------------------------+
| Field                               | Value                                            |
+-------------------------------------+--------------------------------------------------+
| OS-DCF:diskConfig                   | MANUAL                                           |
| OS-EXT-AZ:availability_zone         | nova                                             |
| OS-EXT-SRV-ATTR:host                | None                                             |
| OS-EXT-SRV-ATTR:hypervisor_hostname | None                                             |
| OS-EXT-SRV-ATTR:instance_name       |                                                  |
| OS-EXT-STS:power_state              | NOSTATE                                          |
| OS-EXT-STS:task_state               | scheduling                                       |
| OS-EXT-STS:vm_state                 | building                                         |
| OS-SRV-USG:launched_at              | None                                             |
| OS-SRV-USG:terminated_at            | None                                             |
| accessIPv4                          |                                                  |
| accessIPv6                          |                                                  |
| addresses                           |                                                  |
| adminPass                           | LCnpM4wskqoL                                     |
| config_drive                        |                                                  |
| created                             | 2022-07-16T07:18:45Z                             |
| flavor                              | flavor_1 (e3a09880-dd50-46dd-87be-0630d111cd00)  |
| hostId                              |                                                  |
| id                                  | 5fab4bbc-0f41-4f68-bb59-66100c092b2c             |
| image                               | BAT-image (a2e36de0-765a-4735-a603-471df8983238) |
| key_name                            | None                                             |
| name                                | c1_ms_1                                          |
| progress                            | 0                                                |
| project_id                          | cf7024f0f2bd46a8b17fd42055a20323                 |
| properties                          |                                                  |
| security_groups                     | name='default'                                   |
| status                              | BUILD                                            |
| updated                             | 2022-07-16T07:18:45Z                             |
| user_id                             | fcd3b2713191485d95befe1941f20e20                 |
| volumes_attached                    |                                                  |
+-------------------------------------+--------------------------------------------------+

ceeinfra@controlhost1:~> openstack server list
+--------------------------------------+-------------+---------+--------------------------------------------+-----------+----------+
| ID                                   | Name        | Status  | Networks                                   | Image     | Flavor   |
+--------------------------------------+-------------+---------+--------------------------------------------+-----------+----------+
| 5fab4bbc-0f41-4f68-bb59-66100c092b2c | c1_ms_1     | ACTIVE  | esohtom_ms_net_vlan_638=172.60.38.63       | BAT-image | flavor_1 |
| 5cb48a3e-36c0-4558-afc0-94834c38ba84 | c3_svf_1    | ACTIVE  | network2=50.50.50.174                      | BAT-image | flavor_1 |
| e343d12a-3336-4075-8131-0e9552e46cb6 | c3_trex_281 | ACTIVE  | network1=40.0.0.123; network2=50.50.50.81  | trex_2_81 | flavor_3 |
| cd492514-39a9-4b78-b1fc-9a5d5cc91191 | VM2         | ACTIVE  | trunk-net=10.10.10.222                     | BAT-image | flavor_1 |
| 2806ff70-86ec-47a6-97a1-20e859e8e188 | VM1         | ACTIVE  | trunk-net=10.10.10.130                     | BAT-image | flavor_1 |
| 58e47d2d-d21d-4a16-809c-67b1e0ffcbca | c4_trex_281 | ACTIVE  | network1=40.0.0.135; network2=50.50.50.252 | trex_2_81 | flavor_3 |
| 2ba6d7ce-4092-4262-8a59-396db5a7aa0d | c4_svf_2    | ACTIVE  | network2=50.50.50.88                       | BAT-image | flavor_1 |
| 2e434248-996b-4f3a-99fe-715384ac7e90 | c4_svf_1    | SHUTOFF | network2=50.50.50.149                      | BAT-image | flavor_1 |
| 8990cec4-ee28-406e-93a0-a8ca2e1b95ef | c2_2        | ACTIVE  | network2=50.50.50.107                      | BAT-image | flavor_3 |
| 59e4a587-74c9-47c5-8c07-e8f2dbb76f50 | c2_trex_281 | ACTIVE  | network1=40.0.0.216; network2=50.50.50.136 | trex_2_81 | flavor_3 |
| a0446d12-e8e8-495f-9dca-bd3c658c2fce | c1_test     | ACTIVE  | network2=50.50.50.52                       | BAT-image | flavor_1 |
| 7aed1ccd-2ec9-45d8-b25b-39cec598cb1b | c1_2        | ACTIVE  | network2=50.50.50.92                       | BAT-image | flavor_1 |
| f1ff093b-60ae-4c59-a35b-3f320d99551f | c1_trex_281 | ACTIVE  | network1=40.0.0.83; network2=50.50.50.162  | trex_2_81 | flavor_3 |
| 16c439ba-0297-4c0a-9afe-b41f4f195143 | c1_1        | ACTIVE  | network2=50.50.50.68                       | BAT-image | flavor_1 |
| 7d5472ee-8fcc-4373-affb-a15a3bdd5e4e | c2_1        | ACTIVE  | network1=40.0.0.15; network2=50.50.50.236  | BAT-image | flavor_1 |
+--------------------------------------+-------------+---------+--------------------------------------------+-----------+----------+

This VM can actually ping the DC gateway. The compute where the VM is
running uses the legacy vswitch (OVS-dpdk) with no offload capabilities.

controlhost1:/home/ceeinfra # cat /etc/kolla/neutron-server/ml2_conf.ini
[ml2]
type_drivers = vxlan,vlan,flat
tenant_network_types = vxlan,vlan,flat
mechanism_drivers = sriovnicswitch,opendaylight_v2
extension_drivers = port_security,qos
path_mtu = 2140
physical_network_mtus = default:2140,DC259-CEE3-PHY0:2140,DC259-CEE3-PHY1:2140,DC259-CEE3-MLAG:2140,DC259-CEE3-DCGW-NET:2140,DC259-CEE3-MLAG_LEFT:2140,DC259-CEE3-MLAG_RIGHT:2140

[ml2_type_vlan]
network_vlan_ranges = DC259-CEE3-PHY0,DC259-CEE3-PHY1,DC259-CEE3-MLAG,DC259-CEE3-DCGW-NET,DC259-CEE3-MLAG_LEFT,DC259-CEE3-MLAG_RIGHT

[ml2_type_flat]
flat_networks = DC259-CEE3-PHY0,DC259-CEE3-PHY1,DC259-CEE3-MLAG

[ml2_type_vxlan]
vni_ranges = 2001:2999

[securitygroup]
firewall_driver = neutron.agent.firewall.NoopFirewallDriver

[ml2_sdi]

[ml2_bsp]

[ml2_odl]

controlhost1:/home/ceeinfra # cat /etc/kolla/neutron-server/ml2_conf_odl.ini
[ml2_odl]
url = redacted
username = redacted
password = redacted
enable_dhcp_service = False
enable_full_sync = false
port_binding_controller = "pseudo-agentdb-binding"
enable_websocket_pseudo_agentdb = False
odl_features = "operational-port-status"

# scheduler config
[filter_scheduler]
enabled_filters = AggregateMultiTenancyIsolation,AvailabilityZoneFilter,ComputeFilter,ComputeCapabilitiesFilter,ImagePropertiesFilter,AggregateInstanceExtraSpecsFilter,SameHostFilter,DifferentHostFilter,ServerGroupAntiAffinityFilter,ServerGroupAffinityFilter,PciPassthroughFilter,NUMATopologyFilter

The use case in the NFVI solution including a smartNIC (OVS kernel
datapath offload to smart VFs)

Now I create the neutron port representing a smart VF in the MS network,
for which I have the necessary capabilities on compute3 and compute4:

ceeinfra@controlhost1:~> openstack port create --vnic-type direct --vnic-type=direct --binding-profile='{"capabilities": ["switchdev"]}' --disable-port-security --no-security-group --network esohtom_ms_net_vlan_638 SVF_MS_1
+-------------------------+-----------------------------------------------------------------------------+
| Field                   | Value                                                                       |
+-------------------------+-----------------------------------------------------------------------------+
| admin_state_up          | UP                                                                          |
| allowed_address_pairs   |                                                                             |
| binding_host_id         |                                                                             |
| binding_profile         | capabilities='['switchdev']'                                                |
| binding_vif_details     |                                                                             |
| binding_vif_type        | unbound                                                                     |
| binding_vnic_type       | direct                                                                      |
| created_at              | 2022-07-16T08:04:57Z                                                        |
| data_plane_status       | None                                                                        |
| description             |                                                                             |
| device_id               |                                                                             |
| device_owner            |                                                                             |
| dns_assignment          | None                                                                        |
| dns_domain              | None                                                                        |
| dns_name                | None                                                                        |
| extra_dhcp_opts         |                                                                             |
| fixed_ips               | ip_address='172.60.38.94', subnet_id='1800e806-6d24-4c45-a586-935a7ba1d1c5' |
| id                      | e79a0e66-debd-42e8-a46d-9cdc29a7c960                                        |
| ip_allocation           | immediate                                                                   |
| mac_address             | fa:16:3e:99:4a:3e                                                           |
| name                    | SVF_MS_1                                                                    |
| network_id              | 10084fdd-cb84-4ed8-af28-a5b3a79c5bd5                                        |
| numa_affinity_policy    | None                                                                        |
| port_security_enabled   | False                                                                       |
| project_id              | cf7024f0f2bd46a8b17fd42055a20323                                            |
| propagate_uplink_status | None                                                                        |
| qos_network_policy_id   | None                                                                        |
| qos_policy_id           | None                                                                        |
| resource_request        | None                                                                        |
| revision_number         | 1                                                                           |
| security_group_ids      |                                                                             |
| status                  | DOWN                                                                        |
| tags                    |                                                                             |
| trunk_details           | None                                                                        |
| updated_at              | 2022-07-16T08:04:57Z                                                        |
+-------------------------+-----------------------------------------------------------------------------+

On those computes the nova-compute configuration includes:

[pci]
passthrough_whitelist = [{"devname": "data2", "physical_network": null}, {"devname": "data3", "physical_network": null}]

Please note that, according to
https://docs.openstack.org/nova/latest/admin/pci-passthrough.html, we
use a null physical network when we want to address all the NICs that
carry an overlay network.

This works perfectly for server creation with a smart VF on
single-segment neutron networks of vxlan type.

Now I try to create the server with the smart VF on compute3 on the MS
network:

ceeinfra@controlhost1:~> openstack server create --image BAT-image --flavor flavor_1 --port SVF_MS_1 --availability-zone=nova:compute3.dc259cee3.cloud.k2.ericsson.se c3_ms_1
+-------------------------------------+--------------------------------------------------+
| Field                               | Value                                            |
+-------------------------------------+--------------------------------------------------+
| OS-DCF:diskConfig                   | MANUAL                                           |
| OS-EXT-AZ:availability_zone         | nova                                             |
| OS-EXT-SRV-ATTR:host                | None                                             |
| OS-EXT-SRV-ATTR:hypervisor_hostname | None                                             |
| OS-EXT-SRV-ATTR:instance_name       |                                                  |
| OS-EXT-STS:power_state              | NOSTATE                                          |
| OS-EXT-STS:task_state               | scheduling                                       |
| OS-EXT-STS:vm_state                 | building                                         |
| OS-SRV-USG:launched_at              | None                                             |
| OS-SRV-USG:terminated_at            | None                                             |
| accessIPv4                          |                                                  |
| accessIPv6                          |                                                  |
| addresses                           |                                                  |
| adminPass                           | DR6AcaeH85nt                                     |
| config_drive                        |                                                  |
| created                             | 2022-07-16T08:12:28Z                             |
| flavor                              | flavor_1 (e3a09880-dd50-46dd-87be-0630d111cd00)  |
| hostId                              |                                                  |
| id                                  | 09f3f8bb-b4c0-4395-8167-c10609d32d08             |
| image                               | BAT-image (a2e36de0-765a-4735-a603-471df8983238) |
| key_name                            | None                                             |
| name                                | c3_ms_1                                          |
| progress                            | 0                                                |
| project_id                          | cf7024f0f2bd46a8b17fd42055a20323                 |
| properties                          |                                                  |
| security_groups                     | name='default'                                   |
| status                              | BUILD                                            |
| updated                             | 2022-07-16T08:12:28Z                             |
| user_id                             | fcd3b2713191485d95befe1941f20e20                 |
| volumes_attached                    |                                                  |
+-------------------------------------+--------------------------------------------------+

ceeinfra@controlhost1:~> openstack server show c3_ms_1
...
| fault                               | {'code': 500, 'created': '2022-07-16T08:12:31Z', 'message': 'Insufficient compute resources: Requested instance NUMA topology together with requested PCI devices cannot fit the given host NUMA topology; Claim pci failed.', 'details': 'Traceback (most recent call last):\n  File "/usr/lib/python3.6/site-packages/nova/compute/manager.py", line 2418, in _build_and_run_instance\n    limits):\n  File "/usr/lib/python3.6/site-packages/oslo_concurrency/lockutils.py", line 360, in inner\n    return f(*args, **kwargs)\n  File "/usr/lib/python3.6/site-packages/nova/compute/resource_tracker.py", line 172, in instance_claim\n    pci_requests, limits=limits)\n  File "/usr/lib/python3.6/site-packages/nova/compute/claims.py", line 72, in __init__\n    self._claim_test(compute_node, limits)\n  File "/usr/lib/python3.6/site-packages/nova/compute/claims.py", line 114, in _claim_test\n    "; ".join(reasons))\nnova.exception.ComputeResourcesUnavailable: Insufficient compute resources: Requested instance NUMA topology together with requested PCI devices cannot fit the given host NUMA topology; Claim pci failed.\n\nDuring handling of the above exception, another exception occurred:\n\nTraceback (most recent call last):\n  File "/usr/lib/python3.6/site-packages/nova/compute/manager.py", line 2271, in _do_build_and_run_instance\n    filter_properties, request_spec, accel_uuids)\n  File "/usr/lib/python3.6/site-packages/nova/compute/manager.py", line 2469, in _build_and_run_instance\n    instance_uuid=instance.uuid, reason=e.format_message())\nnova.exception.RescheduledException: Build of instance 09f3f8bb-b4c0-4395-8167-c10609d32d08 was re-scheduled: Insufficient compute resources: Requested instance NUMA topology together with requested PCI devices cannot fit the given host NUMA topology; Claim pci failed.\n'} |
...

The scheduler cannot find the needed PCI devices with the needed
capabilities on compute3… but which PCI devices is it looking for?

controlhost3:/home/ceeinfra # grep DC259-CEE3- /var/log/nova/nova-scheduler.log
<180>2022-07-16T10:12:29.680009+02:00 controlhost3.dc259cee3.cloud.k2.ericsson.se nova-scheduler[67299]: 2022-07-16 10:12:29.679 76 WARNING nova.scheduler.host_manager [req-4dd7c37e-eb18-48da-9914-44a6a2a18b1d fcd3b2713191485d95befe1941f20e20 cf7024f0f2bd46a8b17fd42055a20323 - default default] Selected host: compute3.dc259cee3.cloud.k2.ericsson.se failed to consume from instance. Error: PCI device request [InstancePCIRequest(alias_name=None,count=1,is_new=<?>,numa_policy=None,request_id=a5644948-3b13-4cca-a98a-7780ee4d2157,requester_id='e79a0e66-debd-42e8-a46d-9cdc29a7c960',spec=[{physical_network='DC259-CEE3-DCGW-NET'}])] failed: nova.exception.PciDeviceRequestFailed: PCI device request [InstancePCIRequest(alias_name=None,count=1,is_new=<?>,numa_policy=None,request_id=a5644948-3b13-4cca-a98a-7780ee4d2157,requester_id='e79a0e66-debd-42e8-a46d-9cdc29a7c960',spec=[{physical_network='DC259-CEE3-DCGW-NET'}])] failed

It cannot find any physical network associated with the vxlan segments,
so it moves on to the next segment, a vlan segment that does have a
physical network. But according to the nova-compute
passthrough_whitelist, DC259-CEE3-DCGW-NET is definitely not able to
provide the required PCI capabilities. The same happens if I skip
creating the explicit vxlan segment 2700.
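
Expressed as a sketch (our simplified reading of the behavior, not nova's actual code), the segment walk looks like this:

# The segments of esohtom_ms_net_vlan_638 as listed above.
segments = [
    {"network_type": "vxlan", "physical_network": None},  # segment 2700
    {"network_type": "vxlan", "physical_network": None},  # segment 2638
    {"network_type": "vlan",
     "physical_network": "DC259-CEE3-DCGW-NET"},          # segment 638
]

def pick_physnet(segments):
    # Walk the segments and return the first physnet found, ignoring
    # the possibility of binding on a tunneled (physnet-less) segment.
    for segment in segments:
        if segment["physical_network"] is not None:
            return segment["physical_network"]
    return None

# Yields 'DC259-CEE3-DCGW-NET', which the whitelist on compute3 can
# never serve, so the PCI claim fails.
print(pick_physnet(segments))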

** Affects: nova
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1983570

Title:
  cannot schedule ovs sriov offload port to tunneled segment

Status in OpenStack Compute (nova):
  New


To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1983570/+subscriptions