yahoo-eng-team team mailing list archive

[Bug 2033247] [NEW] PCI Leaks when multiple detach operations performed in parallel

 

Public bug reported:

We are using the OpenStack Yoga release and need to attach/detach ports to a VM dynamically. We are observing PCI leaks when multiple detach operations are performed simultaneously.

PCIs start leaking once the “pci_requests” column of the OpenStack “instance_extra” table is exhausted, i.e. the serialized data grows past the column's size limit. From that point OpenStack can no longer attach or detach a port: every operation fails with an exception, no cleanup is performed, and PCIs are leaked. This table is used by OpenStack to record the “PCIRequest” entries for all interfaces attached to a VM, and the historical records accumulate there.

DBDataError (pymysql.err.DataError) (1406, "Data too long for column 'pci_requests' at row 1")
[SQL: UPDATE instance_extra SET updated_at=%(updated_at)s, device_metadata=%(device_metadata)s, numa_topology=%(numa_topology)s, pci_requests=%(pci_requests)s, flavor=%(flavor)s WHERE instance_extra.deleted = %(deleted_1)s AND instance_extra.instance_uuid = %(instance_uuid_1)s]
[parameters: {'updated_at': datetime.datetime(2023, 8, 3, 14, 39, 56, 116791), 'device_metadata': '{"nova_object.name": "InstanceDeviceMetadata", "nova_object.namespace": "nova", "nova_object.version": "1.0", "nova_object.data": {"devices": [{"nova ... (6168 characters truncated) ... :00:14.0"}, "nova_object.changes": ["address"]}}, "nova_object.changes": ["bus", "vf_trusted", "mac", "vlan"]}]}, "nova_object.changes": ["devices"]}', 'numa_topology': '{"nova_object.name": "InstanceNUMATopology", "nova_object.namespace": "nova", "nova_object.version": "1.3", "nova_object.data": {"cells": [{"nova_obj ... (946 characters truncated) ... nges": ["id", "cpu_pinning_raw", "cpuset_reserved"]}], "emulator_threads_policy": null}, "nova_object.changes": ["emulator_threads_policy", "cells"]}', 'pci_requests': '[{"count": 1, "spec": [{"physical_network": "sriov3", "remote_managed": "False"}], "alias_name": null, "is_new": false, "numa_policy": null, "request ... (65464 characters truncated) ...  "is_new": false, "numa_policy": null, "request_id": "2d5ef4bd-d499-4e62-a617-75ed4535c930", "requester_id": "f4fabf3b-ccc1-4117-bb30-de53a9a55d66"}]', 'flavor': '{"cur": {"nova_object.name": "Flavor", "nova_object.namespace": "nova", "nova_object.version": "1.2", "nova_object.data": {"id": 71, "name": "SOLTEST ... (481 characters truncated) ... "2023-07-04T05:45:02Z", "updated_at": null, "deleted_at": null, "deleted": false}, "nova_object.changes": ["extra_specs"]}, "old": null, "new": null}', 'deleted_1': 0, 'instance_uuid_1': 'dd5d0568-1aad-47ed-8418-78a6c75363cc'}] 

I checked the number of “PCIRequestRecord” entries stored in the “pci_requests” field of the “instance_extra” table and found 260 records, which is roughly equal to the number of attach operations performed on this node before we started seeing the PCI leak reported by Mohit in his mail below. This also indicates that the “PciRequest” record for a port created via the operator is not cleaned up even after that port is detached/deleted.


I suspected that this is another issue in OpenStack when it handles parallel detach requests received from the operator, and I ran an exercise to prove it. In my test case I had one pod with 2 SR-IOV vNICs and performed attach/detach operations in a loop; after multiple attach/detach cycles we hit the PCI leak issue. My hunch was that OpenStack acknowledges a detach request immediately while the detachment of that interface has not yet completed in the backend, and the nova service cannot handle another detach operation for SR-IOV ports while the previous one is still in progress, which leads to the “instance_extra” table filling up. There was barely any gap between two successive detach requests sent to OpenStack, because the operator sends the next detach request immediately after the previous one.
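The loop below is a minimal reproduction sketch of that pattern using openstacksdk, not the operator’s actual code: it attaches an SR-IOV port and immediately requests the detach, without waiting for the previous operation to finish on the compute node. The cloud name, server name and network name are placeholders.

import openstack

conn = openstack.connect(cloud="mycloud")       # assumed clouds.yaml entry
server = conn.compute.find_server("my-vm")
network = conn.network.find_network("sriov3")

# Pre-create one SR-IOV (direct vNIC) port and reuse it for every cycle.
port = conn.network.create_port(network_id=network.id,
                                binding_vnic_type="direct")

for _ in range(600):
    iface = conn.compute.create_server_interface(server, port_id=port.id)
    # The detach is requested as soon as the attach call returns; nothing
    # waits for the backend to actually finish detaching the interface.
    conn.compute.delete_server_interface(iface, server)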

To prove this, we introduced a delay of 10 seconds in our code to serialize the detach operations, so that a new detach is never issued while the previous one is still pending in the OpenStack services (a sketch of this serialization follows the list below). With that change I was able to successfully execute the following test cases:
•	600 attach/detach operations for a single pod with 2 SR-IOV vNICs
•	400 attach/detach operations for 4 pods, each with 2 SR-IOV vNICs
•	320 attach/detach operations for 8 pods, each with 2 SR-IOV vNICs
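The sketch below shows the shape of that workaround (the helper name and parameters are illustrative, not the operator’s actual code): after requesting a detach, it waits until the interface no longer appears on the server, then adds a settling delay so the next detach never overlaps the previous one.

import time

def detach_and_wait(conn, server, iface, delay=10, timeout=60):
    conn.compute.delete_server_interface(iface, server)
    deadline = time.time() + timeout
    while time.time() < deadline:
        attached = {i.port_id for i in conn.compute.server_interfaces(server)}
        if iface.port_id not in attached:
            break
        time.sleep(1)
    # Extra grace period so successive detaches are fully serialized.
    time.sleep(delay)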

In total we performed ~1300 vNIC attach/detach operations and I do not see any leak with these changes; the PCI pool is completely available after that many attach/detach operations.

This shows that OpenStack is not able to handle simultaneous detach operations in the Yoga release.

** Affects: nova
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/2033247

Title:
  PCI Leaks when multiple detach operations performed in parallel

Status in OpenStack Compute (nova):
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/2033247/+subscriptions