yahoo-eng-team team mailing list archive

[Bug 1896741] [NEW] Intel mediated device info doesn't provide a name attribute

 

Public bug reported:

While testing a Xeon server for virtual GPU support, I saw that Nova
raises an exception because the i915 driver doesn't provide a name for
mdev types:

Sep 23 06:00:19 mymachine.redhat.com nova-compute[195458]: ERROR nova.compute.manager Traceback (most recent call last):
Sep 23 06:00:19 mymachine.redhat.com nova-compute[195458]: ERROR nova.compute.manager   File "/opt/stack/nova/nova/compute/manager.py", line 9824, in _update_available_resource_for_node
Sep 23 06:00:19 mymachine.redhat.com nova-compute[195458]: ERROR nova.compute.manager     startup=startup)
Sep 23 06:00:19 mymachine.redhat.com nova-compute[195458]: ERROR nova.compute.manager   File "/opt/stack/nova/nova/compute/resource_tracker.py", line 896, in update_available_resource
Sep 23 06:00:19 mymachine.redhat.com nova-compute[195458]: ERROR nova.compute.manager     self._update_available_resource(context, resources, startup=startup)
Sep 23 06:00:19 mymachine.redhat.com nova-compute[195458]: ERROR nova.compute.manager   File "/usr/local/lib/python3.7/site-packages/oslo_concurrency/lockutils.py", line 360, in inner
Sep 23 06:00:19 mymachine.redhat.com nova-compute[195458]: ERROR nova.compute.manager     return f(*args, **kwargs)
Sep 23 06:00:19 mymachine.redhat.com nova-compute[195458]: ERROR nova.compute.manager   File "/opt/stack/nova/nova/compute/resource_tracker.py", line 981, in _update_available_resource
Sep 23 06:00:19 mymachine.redhat.com nova-compute[195458]: ERROR nova.compute.manager     self._update(context, cn, startup=startup)
Sep 23 06:00:19 mymachine.redhat.com nova-compute[195458]: ERROR nova.compute.manager   File "/opt/stack/nova/nova/compute/resource_tracker.py", line 1233, in _update
Sep 23 06:00:19 mymachine.redhat.com nova-compute[195458]: ERROR nova.compute.manager     self._update_to_placement(context, compute_node, startup)
Sep 23 06:00:19 mymachine.redhat.com nova-compute[195458]: ERROR nova.compute.manager   File "/usr/local/lib/python3.7/site-packages/retrying.py", line 49, in wrapped_f
Sep 23 06:00:19 mymachine.redhat.com nova-compute[195458]: ERROR nova.compute.manager     return Retrying(*dargs, **dkw).call(f, *args, **kw)
Sep 23 06:00:19 mymachine.redhat.com nova-compute[195458]: ERROR nova.compute.manager   File "/usr/local/lib/python3.7/site-packages/retrying.py", line 206, in call
Sep 23 06:00:19 mymachine.redhat.com nova-compute[195458]: ERROR nova.compute.manager     return attempt.get(self._wrap_exception)
Sep 23 06:00:19 mymachine.redhat.com nova-compute[195458]: ERROR nova.compute.manager   File "/usr/local/lib/python3.7/site-packages/retrying.py", line 247, in get
Sep 23 06:00:19 mymachine.redhat.com nova-compute[195458]: ERROR nova.compute.manager     six.reraise(self.value[0], self.value[1], self.value[2])
Sep 23 06:00:19 mymachine.redhat.com nova-compute[195458]: ERROR nova.compute.manager   File "/usr/local/lib/python3.7/site-packages/six.py", line 703, in reraise
Sep 23 06:00:19 mymachine.redhat.com nova-compute[195458]: ERROR nova.compute.manager     raise value
Sep 23 06:00:19 mymachine.redhat.com nova-compute[195458]: ERROR nova.compute.manager   File "/usr/local/lib/python3.7/site-packages/retrying.py", line 200, in call
Sep 23 06:00:19 mymachine.redhat.com nova-compute[195458]: ERROR nova.compute.manager     attempt = Attempt(fn(*args, **kwargs), attempt_number, False)
Sep 23 06:00:19 mymachine.redhat.com nova-compute[195458]: ERROR nova.compute.manager   File "/opt/stack/nova/nova/compute/resource_tracker.py", line 1169, in _update_to_placement
Sep 23 06:00:19 mymachine.redhat.com nova-compute[195458]: ERROR nova.compute.manager     self.driver.update_provider_tree(prov_tree, nodename)
Sep 23 06:00:19 mymachine.redhat.com nova-compute[195458]: ERROR nova.compute.manager   File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 7857, in update_provider_tree
Sep 23 06:00:19 mymachine.redhat.com nova-compute[195458]: ERROR nova.compute.manager     provider_tree, nodename, allocations=allocations)
Sep 23 06:00:19 mymachine.redhat.com nova-compute[195458]: ERROR nova.compute.manager   File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 8250, in _update_provider_tree_for_vgpu
Sep 23 06:00:19 mymachine.redhat.com nova-compute[195458]: ERROR nova.compute.manager     inventories_dict = self._get_gpu_inventories()
Sep 23 06:00:19 mymachine.redhat.com nova-compute[195458]: ERROR nova.compute.manager   File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 7028, in _get_gpu_inventories
Sep 23 06:00:19 mymachine.redhat.com nova-compute[195458]: ERROR nova.compute.manager     count_per_dev = self._count_mdev_capable_devices(enabled_vgpu_types)
Sep 23 06:00:19 mymachine.redhat.com nova-compute[195458]: ERROR nova.compute.manager   File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 6984, in _count_mdev_capable_devices
Sep 23 06:00:19 mymachine.redhat.com nova-compute[195458]: ERROR nova.compute.manager     types=enabled_vgpu_types)
Sep 23 06:00:19 mymachine.redhat.com nova-compute[195458]: ERROR nova.compute.manager   File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 7268, in _get_mdev_capable_devices
Sep 23 06:00:19 mymachine.redhat.com nova-compute[195458]: ERROR nova.compute.manager     device = self._get_mdev_capabilities_for_dev(name, types)
Sep 23 06:00:19 mymachine.redhat.com nova-compute[195458]: ERROR nova.compute.manager   File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 7253, in _get_mdev_capabilities_for_dev
Sep 23 06:00:19 mymachine.redhat.com nova-compute[195458]: ERROR nova.compute.manager     'name': cap['name'],
Sep 23 06:00:19 mymachine.redhat.com nova-compute[195458]: ERROR nova.compute.manager KeyError: 'name'


For example, this i915 mdev type directory has no "name" file:

[root@mymachine ~]# ll /sys/class/mdev_bus/0000\:00\:02.0/mdev_supported_types/i915-GVTg_V5_8/
total 0
-r--r--r--. 1 root root 4096 Sep 22 14:18 available_instances
--w-------. 1 root root 4096 Sep 23 06:01 create
-r--r--r--. 1 root root 4096 Sep 23 05:43 description
-r--r--r--. 1 root root 4096 Sep 22 14:18 device_api
drwxr-xr-x. 2 root root    0 Sep 23 06:01 devices

The kernel driver API documentation
https://www.kernel.org/doc/html/latest/driver-api/vfio-mediated-device.html
says that the "name" attribute is optional:

"name

This attribute should show human readable name. This is optional
attribute."


The fix should be easy, since we don't use this attribute in Nova.
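
As a rough illustration of the kind of change meant here (not the actual
patch), the sketch below builds the per-type dict without requiring
'name'. Only the cap['name'] lookup comes from the traceback; the function
name build_mdev_type_info and the other keys ('type',
'availableInstances', 'deviceAPI') are assumptions made for the example.

```python
def build_mdev_type_info(cap):
    """Build the info dict for one mdev type from a capability dict.

    Hypothetical sketch: every key except 'name' is an assumed field.
    """
    info = {
        'type': cap['type'],
        'availableInstances': cap['availableInstances'],
        'deviceAPI': cap['deviceAPI'],
    }
    # 'name' is optional per the kernel documentation, so copy it only
    # when the driver exposes it instead of indexing unconditionally.
    if 'name' in cap:
        info['name'] = cap['name']
    return info
```

Since Nova doesn't use the attribute, dropping the 'name' key altogether
would be an equally valid option; the conditional copy just keeps the
value available for drivers that do provide it.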

** Affects: nova
     Importance: Low
     Assignee: Sylvain Bauza (sylvain-bauza)
         Status: Triaged


** Tags: libvirt vgpu

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1896741
