Message #60178
[Bug 1653810] [NEW] [sriov] Modifying or removing pci_passthrough_whitelist may result in inconsistent VF availability
Public bug reported:
OpenStack Version: v14 (Newton)
NIC: Mellanox ConnectX-3 Pro
While testing an SR-IOV implementation, we found that the
pci_passthrough_whitelist option in nova.conf is involved in the
population of the pci_devices table in the Nova DB. Changing the
device/interface in the whitelist, or commenting out the line altogether,
and then restarting nova-compute can result in the existing entries being
marked as 'deleted' in the database, including the entry for a VF that is
allocated to a running instance. Restoring the pci_passthrough_whitelist
option with the same device/interface results in new entries being
created and marked as 'available'. This can cause PCI device claim
issues if an existing instance is still running and using a VF and
another instance is booted using a 'direct' port.
In the following table, you can see the original entries, including one
allocated VF. During testing, we commented out the
pci_passthrough_whitelist line in nova.conf and restarted nova-compute.
The entries were marked as 'deleted', though the running instance was
not deleted and continued to function. We then restored the
pci_passthrough_whitelist setting and restarted nova-compute again. New
entries were created and marked as 'available':
MariaDB [nova]> select * from pci_devices;
+---------------------+---------------------+---------------------+---------+-----+-----------------+--------------+------------+-----------+----------+------------------+-----------------+-------------+------------+--------------------------------------+--------------------------------------+-----------+--------------+
| created_at | updated_at | deleted_at | deleted | id | compute_node_id | address | product_id | vendor_id | dev_type | dev_id | label | status | extra_info | instance_uuid | request_id | numa_node | parent_addr |
+---------------------+---------------------+---------------------+---------+-----+-----------------+--------------+------------+-----------+----------+------------------+-----------------+-------------+------------+--------------------------------------+--------------------------------------+-----------+--------------+
| 2016-12-29 15:23:36 | 2016-12-29 20:40:34 | 2016-12-29 20:42:26 | 72 | 72 | 6 | 0000:07:00.0 | 1007 | 15b3 | type-PF | pci_0000_07_00_0 | label_15b3_1007 | unavailable | {} | NULL | NULL | 0 | NULL |
| 2016-12-29 15:23:36 | 2016-12-29 20:40:34 | 2016-12-29 20:43:23 | 75 | 75 | 6 | 0000:07:00.1 | 1004 | 15b3 | type-VF | pci_0000_07_00_1 | label_15b3_1004 | available | {} | NULL | NULL | 0 | 0000:07:00.0 |
| 2016-12-29 15:23:36 | 2016-12-29 20:40:34 | 2016-12-29 20:42:26 | 78 | 78 | 6 | 0000:07:00.2 | 1004 | 15b3 | type-VF | pci_0000_07_00_2 | label_15b3_1004 | available | {} | NULL | NULL | 0 | 0000:07:00.0 |
| 2016-12-29 15:23:36 | 2016-12-29 20:40:34 | 2016-12-29 20:44:25 | 81 | 81 | 6 | 0000:07:00.3 | 1004 | 15b3 | type-VF | pci_0000_07_00_3 | label_15b3_1004 | available | {} | NULL | NULL | 0 | 0000:07:00.0 |
| 2016-12-29 15:23:36 | 2016-12-29 20:40:34 | 2016-12-29 20:42:26 | 84 | 84 | 6 | 0000:07:00.4 | 1004 | 15b3 | type-VF | pci_0000_07_00_4 | label_15b3_1004 | available | {} | NULL | NULL | 0 | 0000:07:00.0 |
| 2016-12-29 15:23:36 | 2016-12-29 20:40:34 | 2016-12-29 20:43:23 | 87 | 87 | 6 | 0000:07:00.5 | 1004 | 15b3 | type-VF | pci_0000_07_00_5 | label_15b3_1004 | available | {} | NULL | NULL | 0 | 0000:07:00.0 |
| 2016-12-29 15:23:36 | 2016-12-29 20:40:34 | 2016-12-29 20:42:26 | 90 | 90 | 6 | 0000:07:00.6 | 1004 | 15b3 | type-VF | pci_0000_07_00_6 | label_15b3_1004 | available | {} | NULL | NULL | 0 | 0000:07:00.0 |
| 2016-12-29 15:23:36 | 2016-12-29 20:40:34 | 2016-12-29 20:44:51 | 93 | 93 | 6 | 0000:07:00.7 | 1004 | 15b3 | type-VF | pci_0000_07_00_7 | label_15b3_1004 | available | {} | NULL | NULL | 0 | 0000:07:00.0 |
| 2016-12-29 15:23:36 | 2016-12-29 17:40:25 | 2016-12-29 20:42:26 | 96 | 96 | 6 | 0000:07:01.0 | 1004 | 15b3 | type-VF | pci_0000_07_01_0 | label_15b3_1004 | allocated | {} | 178c733b-fb6a-4c97-b1e5-cdc14aae2e0d | b8d79a88-5918-4a38-b2fb-de97a263c70e | 0 | 0000:07:00.0 |
| 2017-01-03 22:23:37 | NULL | NULL | 0 | 231 | 6 | 0000:07:00.0 | 1007 | 15b3 | type-PF | pci_0000_07_00_0 | label_15b3_1007 | available | {} | NULL | NULL | 0 | NULL |
| 2017-01-03 22:23:37 | NULL | NULL | 0 | 234 | 6 | 0000:07:00.1 | 1004 | 15b3 | type-VF | pci_0000_07_00_1 | label_15b3_1004 | available | {} | NULL | NULL | 0 | 0000:07:00.0 |
| 2017-01-03 22:23:37 | NULL | NULL | 0 | 237 | 6 | 0000:07:00.2 | 1004 | 15b3 | type-VF | pci_0000_07_00_2 | label_15b3_1004 | available | {} | NULL | NULL | 0 | 0000:07:00.0 |
| 2017-01-03 22:23:37 | NULL | NULL | 0 | 240 | 6 | 0000:07:00.3 | 1004 | 15b3 | type-VF | pci_0000_07_00_3 | label_15b3_1004 | available | {} | NULL | NULL | 0 | 0000:07:00.0 |
| 2017-01-03 22:23:37 | NULL | NULL | 0 | 243 | 6 | 0000:07:00.4 | 1004 | 15b3 | type-VF | pci_0000_07_00_4 | label_15b3_1004 | available | {} | NULL | NULL | 0 | 0000:07:00.0 |
| 2017-01-03 22:23:37 | NULL | NULL | 0 | 246 | 6 | 0000:07:00.5 | 1004 | 15b3 | type-VF | pci_0000_07_00_5 | label_15b3_1004 | available | {} | NULL | NULL | 0 | 0000:07:00.0 |
| 2017-01-03 22:23:37 | NULL | NULL | 0 | 249 | 6 | 0000:07:00.6 | 1004 | 15b3 | type-VF | pci_0000_07_00_6 | label_15b3_1004 | available | {} | NULL | NULL | 0 | 0000:07:00.0 |
| 2017-01-03 22:23:38 | NULL | NULL | 0 | 252 | 6 | 0000:07:01.0 | 1004 | 15b3 | type-VF | pci_0000_07_01_0 | label_15b3_1004 | available | {} | NULL | NULL | 0 | 0000:07:00.0 |
+---------------------+---------------------+---------------------+---------+-----+-----------------+--------------+------------+-----------+----------+------------------+-----------------+-------------+------------+--------------------------------------+--------------------------------------+-----------+--------------+
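The conflict in the table above can be spotted mechanically. The following
is a minimal sketch, not part of Nova: find_stale_allocations is a
hypothetical helper, and the rows are plain dicts holding a subset of the
pci_devices columns copied from the table. It assumes that a row with a
non-zero 'deleted' value is soft-deleted, as the table suggests. It flags
any PCI address where a soft-deleted row was still 'allocated' to an
instance while a live row for the same address is 'available'. It cannot
tell whether the instance is actually still running.

```python
from collections import defaultdict


def find_stale_allocations(rows):
    """Return (address, old_id, new_id, instance_uuid) tuples for addresses
    where a soft-deleted 'allocated' row coexists with a live 'available' row.

    Hypothetical helper for illustration only; it is not a Nova API.
    """
    by_address = defaultdict(list)
    for row in rows:
        by_address[row["address"]].append(row)

    conflicts = []
    for address, group in sorted(by_address.items()):
        # Soft-deleted rows (deleted != 0) that were still tied to an instance.
        orphaned = [r for r in group
                    if r["deleted"] != 0
                    and r["status"] == "allocated"
                    and r["instance_uuid"]]
        # Live rows (deleted == 0) that the scheduler would consider free.
        live = [r for r in group
                if r["deleted"] == 0 and r["status"] == "available"]
        for old in orphaned:
            for new in live:
                conflicts.append(
                    (address, old["id"], new["id"], old["instance_uuid"]))
    return conflicts


# Subset of the rows shown in the table above.
SAMPLE_ROWS = [
    {"id": 90, "deleted": 90, "address": "0000:07:00.6",
     "status": "available", "instance_uuid": None},
    {"id": 96, "deleted": 96, "address": "0000:07:01.0",
     "status": "allocated",
     "instance_uuid": "178c733b-fb6a-4c97-b1e5-cdc14aae2e0d"},
    {"id": 249, "deleted": 0, "address": "0000:07:00.6",
     "status": "available", "instance_uuid": None},
    {"id": 252, "deleted": 0, "address": "0000:07:01.0",
     "status": "available", "instance_uuid": None},
]

for conflict in find_stale_allocations(SAMPLE_ROWS):
    print(conflict)
```

With the sample rows, only 0000:07:01.0 is reported (old id 96, new id 252),
which is the VF named in the libvirt error.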
A new instance was then booted using a new 'direct' port. The instance
went into the ERROR state with the following error:
2017-01-03 16:10:10.513 12103 ERROR nova.compute.manager [instance: ad961a72-198f-4e3d-8ce0-c157668a44d6] libvirtError: Requested operation is not valid: PCI device 0000:07:01.0 is in use by driver QEMU, domain instance-0000007e
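Domain instance-0000007e corresponds to the instance UUID recorded in the
deleted pci_devices entry, 178c733b-fb6a-4c97-b1e5-cdc14aae2e0d. The
interface on the host can be seen here; note that vf 7 still has a MAC
address and VLAN assigned: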
root@compute01:# ip link show ens1d1
22: ens1d1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master br-vlan portid e41d2d03005b6213 state UP mode DEFAULT group default qlen 1000
link/ether e4:1d:2d:5b:62:13 brd ff:ff:ff:ff:ff:ff
vf 0 MAC 00:00:00:00:00:00, vlan 4095, spoof checking off, link-state auto
vf 1 MAC 00:00:00:00:00:00, vlan 4095, spoof checking off, link-state auto
vf 2 MAC 00:00:00:00:00:00, vlan 4095, spoof checking off, link-state auto
vf 3 MAC 00:00:00:00:00:00, vlan 4095, spoof checking off, link-state auto
vf 4 MAC 00:00:00:00:00:00, vlan 4095, spoof checking off, link-state auto
vf 5 MAC 00:00:00:00:00:00, vlan 4095, spoof checking off, link-state auto
vf 6 MAC 00:00:00:00:00:00, vlan 4095, spoof checking off, link-state auto
vf 7 MAC fa:16:3e:27:bd:90, vlan 50, spoof checking on, link-state enable
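As a rough way to see which VFs the host considers configured, the output
above can be parsed. This is only a sketch under assumptions: vfs_in_use is
a hypothetical helper, it expects the 'vf N MAC ..., vlan ...' line format
shown in this report (other iproute2 versions may print it differently),
and it treats an all-zero MAC as "not assigned", which is a heuristic rather
than a guarantee.

```python
import re

# Matches lines such as:
#   vf 7 MAC fa:16:3e:27:bd:90, vlan 50, spoof checking on, link-state enable
VF_LINE = re.compile(
    r"^\s*vf\s+(?P<index>\d+)\s+MAC\s+(?P<mac>[0-9a-fA-F:]{17}),"
    r"\s*vlan\s+(?P<vlan>\d+)"
)

UNASSIGNED_MAC = "00:00:00:00:00:00"


def vfs_in_use(ip_link_output):
    """Return (vf_index, mac, vlan) for every VF line with a non-zero MAC.

    Hypothetical helper; a non-zero MAC is used as a heuristic for "in use".
    """
    in_use = []
    for line in ip_link_output.splitlines():
        match = VF_LINE.match(line)
        if match and match.group("mac") != UNASSIGNED_MAC:
            in_use.append((int(match.group("index")),
                           match.group("mac"),
                           int(match.group("vlan"))))
    return in_use


# Lines copied from the 'ip link show ens1d1' output above.
SAMPLE_OUTPUT = """\
link/ether e4:1d:2d:5b:62:13 brd ff:ff:ff:ff:ff:ff
vf 5 MAC 00:00:00:00:00:00, vlan 4095, spoof checking off, link-state auto
vf 6 MAC 00:00:00:00:00:00, vlan 4095, spoof checking off, link-state auto
vf 7 MAC fa:16:3e:27:bd:90, vlan 50, spoof checking on, link-state enable
"""

print(vfs_in_use(SAMPLE_OUTPUT))
```

For the sample lines this reports only vf 7, with MAC fa:16:3e:27:bd:90 and
VLAN 50.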
Nova made no attempt to provision a different VF, or to re-populate the
entries in pci_devices based on the existing VF allocation on the host.
I'm not sure what the expected behavior is meant to be in this
circumstance, if any.
A similar bug was reported at:
https://bugs.launchpad.net/nova/+bug/1633120
Please let me know if you need any additional info.
** Affects: nova
Importance: Undecided
Status: New
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1653810