yahoo-eng-team team mailing list archive

[Bug 1565721] [NEW] SR-IOV PF passthrough breaks resource tracking

 

Public bug reported:

Enable PCI passthrough on a compute host (whitelisting devices is
explained in more detail in the docs), then create a network, a subnet
and a port that represents an SR-IOV physical function (PF) passthrough:

$ neutron net-create --provider:physical_network=phynet --provider:network_type=flat sriov-net
$ neutron subnet-create sriov-net 192.168.2.0/24 --name sriov-subnet
$ neutron port-create sriov-net --binding:vnic_type=direct-physical --name pf

After that, try to boot an instance using the created port. Provided
pci_passthrough_whitelist was set up correctly, this should work:

$ nova boot --image xxx --flavor 1 --nic port-id=$PORT_ABOVE testvm

However, the next resource tracker run fails with:

2016-04-04 11:25:34.663 ERROR nova.compute.manager [req-d8095318-9710-48a8-a054-4581641c3bf3 None None] Error updating resources for node kilmainham-ghost.
2016-04-04 11:25:34.663 TRACE nova.compute.manager Traceback (most recent call last):
2016-04-04 11:25:34.663 TRACE nova.compute.manager   File "/opt/stack/nova/nova/compute/manager.py", line 6442, in update_available_resource_for_node
2016-04-04 11:25:34.663 TRACE nova.compute.manager     rt.update_available_resource(context)
2016-04-04 11:25:34.663 TRACE nova.compute.manager   File "/opt/stack/nova/nova/compute/resource_tracker.py", line 458, in update_available_resource
2016-04-04 11:25:34.663 TRACE nova.compute.manager     self._update_available_resource(context, resources)
2016-04-04 11:25:34.663 TRACE nova.compute.manager   File "/usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py", line 271, in inner
2016-04-04 11:25:34.663 TRACE nova.compute.manager     return f(*args, **kwargs)
2016-04-04 11:25:34.663 TRACE nova.compute.manager   File "/opt/stack/nova/nova/compute/resource_tracker.py", line 493, in _update_available_resource
2016-04-04 11:25:34.663 TRACE nova.compute.manager     self.pci_tracker.update_devices_from_hypervisor_resources(dev_json)
2016-04-04 11:25:34.663 TRACE nova.compute.manager   File "/opt/stack/nova/nova/pci/manager.py", line 118, in update_devices_from_hypervisor_resources
2016-04-04 11:25:34.663 TRACE nova.compute.manager     self._set_hvdevs(devices)
2016-04-04 11:25:34.663 TRACE nova.compute.manager   File "/opt/stack/nova/nova/pci/manager.py", line 141, in _set_hvdevs
2016-04-04 11:25:34.663 TRACE nova.compute.manager     self.stats.remove_device(existed)
2016-04-04 11:25:34.663 TRACE nova.compute.manager   File "/opt/stack/nova/nova/pci/stats.py", line 138, in remove_device
2016-04-04 11:25:34.663 TRACE nova.compute.manager     pool['devices'].remove(dev)
2016-04-04 11:25:34.663 TRACE nova.compute.manager ValueError: list.remove(x): x not in list

This aborts the resource tracker (RT) periodic run, so the periodic
task updates no further resources.
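
For illustration only, here is a minimal sketch of how a pool-based
device list can raise this kind of ValueError. PoolStats and its methods
are hypothetical, simplified stand-ins and not the actual code in
nova/pci/stats.py. The scenario (a device leaving its pool when it is
claimed, then being removed a second time during a later sync) is an
assumption about one way the error could arise, not a confirmed root
cause.

```python
class PoolStats:
    """Hypothetical, simplified stand-in for a PCI device pool tracker."""

    def __init__(self):
        # Each pool groups devices that share vendor and product IDs.
        self.pools = []

    def _find_pool(self, dev):
        for pool in self.pools:
            if (pool["vendor_id"], pool["product_id"]) == (
                dev["vendor_id"],
                dev["product_id"],
            ):
                return pool
        return None

    def add_device(self, dev):
        pool = self._find_pool(dev)
        if pool is None:
            pool = {
                "vendor_id": dev["vendor_id"],
                "product_id": dev["product_id"],
                "devices": [],
            }
            self.pools.append(pool)
        pool["devices"].append(dev)

    def remove_device(self, dev):
        """Unguarded removal: raises ValueError if dev is not in its pool."""
        pool = self._find_pool(dev)
        if pool is None:
            return
        pool["devices"].remove(dev)

    def remove_device_tolerant(self, dev):
        """Guarded variant: returns False instead of raising."""
        pool = self._find_pool(dev)
        if pool is None or dev not in pool["devices"]:
            return False
        pool["devices"].remove(dev)
        return True


if __name__ == "__main__":
    stats = PoolStats()
    # Made-up device values, used only for this demonstration.
    pf = {"address": "0000:81:00.0", "vendor_id": "8086", "product_id": "10fb"}

    stats.add_device(pf)
    stats.remove_device(pf)  # e.g. the device leaves the pool when claimed

    try:
        stats.remove_device(pf)  # a later sync tries to remove it again
    except ValueError as exc:
        print("second removal failed:", exc)

    print("tolerant removal returned:", stats.remove_device_tolerant(pf))
```

Whether a guard like remove_device_tolerant would be the right fix is a
separate question: it keeps the periodic task alive, but it could also
hide an inconsistency between the tracked devices and the pools.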

** Affects: nova
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1565721

Title:
  SR-IOV PF passthrough breaks resource tracking

Status in OpenStack Compute (nova):
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1565721/+subscriptions