yahoo-eng-team team mailing list archive

[Bug 1940641] [NEW] nova compute with allocated vgpu device failed to start after host reboot

 

Public bug reported:

Description
=====================

The nova-compute service fails to start after a host reboot if vGPU
virtual machines existed on the host before the reboot.

Error log

2021-08-20 09:37:30.331 284159 DEBUG nova.virt.libvirt.volume.mount [None req-6ad4e06c-980e-4759-8b36-6c696e596dab - - - - -] Initialising _HostMountState generation 0 host_up /var/lib/openstack/lib/python3.8/site-packages/nova/virt/libvirt/volume/mount.py:131
2021-08-20 09:37:30.421 284159 ERROR oslo_service.service [-] Error starting thread.: libvirt.libvirtError: Node device not found: no node device with matching name 'mdev_74527849_d08c_4243_b868_f84a1437c9b5'
2021-08-20 09:37:30.421 284159 ERROR oslo_service.service Traceback (most recent call last):
2021-08-20 09:37:30.421 284159 ERROR oslo_service.service   File "/var/lib/openstack/lib/python3.8/site-packages/oslo_service/service.py", line 807, in run_service
2021-08-20 09:37:30.421 284159 ERROR oslo_service.service     service.start()
2021-08-20 09:37:30.421 284159 ERROR oslo_service.service   File "/var/lib/openstack/lib/python3.8/site-packages/nova/service.py", line 159, in start
2021-08-20 09:37:30.421 284159 ERROR oslo_service.service     self.manager.init_host()
2021-08-20 09:37:30.421 284159 ERROR oslo_service.service   File "/var/lib/openstack/lib/python3.8/site-packages/nova/compute/manager.py", line 1414, in init_host
2021-08-20 09:37:30.421 284159 ERROR oslo_service.service     self.driver.init_host(host=self.host)
2021-08-20 09:37:30.421 284159 ERROR oslo_service.service   File "/var/lib/openstack/lib/python3.8/site-packages/nova/virt/libvirt/driver.py", line 733, in init_host
2021-08-20 09:37:30.421 284159 ERROR oslo_service.service     self._recreate_assigned_mediated_devices()
2021-08-20 09:37:30.421 284159 ERROR oslo_service.service   File "/var/lib/openstack/lib/python3.8/site-packages/nova/virt/libvirt/driver.py", line 862, in _recreate_assigned_mediated_devices
2021-08-20 09:37:30.421 284159 ERROR oslo_service.service     dev_info = self._get_mediated_device_information(dev_name)
2021-08-20 09:37:30.421 284159 ERROR oslo_service.service   File "/var/lib/openstack/lib/python3.8/site-packages/nova/virt/libvirt/driver.py", line 7380, in _get_mediated_device_information
2021-08-20 09:37:30.421 284159 ERROR oslo_service.service     virtdev = self._host.device_lookup_by_name(devname)
2021-08-20 09:37:30.421 284159 ERROR oslo_service.service   File "/var/lib/openstack/lib/python3.8/site-packages/nova/virt/libvirt/host.py", line 1153, in device_lookup_by_name
2021-08-20 09:37:30.421 284159 ERROR oslo_service.service     return self.get_connection().nodeDeviceLookupByName(name)
2021-08-20 09:37:30.421 284159 ERROR oslo_service.service   File "/var/lib/openstack/lib/python3.8/site-packages/eventlet/tpool.py", line 190, in doit
2021-08-20 09:37:30.421 284159 ERROR oslo_service.service     result = proxy_call(self._autowrap, f, *args, **kwargs)
2021-08-20 09:37:30.421 284159 ERROR oslo_service.service   File "/var/lib/openstack/lib/python3.8/site-packages/eventlet/tpool.py", line 148, in proxy_call
2021-08-20 09:37:30.421 284159 ERROR oslo_service.service     rv = execute(f, *args, **kwargs)
2021-08-20 09:37:30.421 284159 ERROR oslo_service.service   File "/var/lib/openstack/lib/python3.8/site-packages/eventlet/tpool.py", line 129, in execute
2021-08-20 09:37:30.421 284159 ERROR oslo_service.service     six.reraise(c, e, tb)
2021-08-20 09:37:30.421 284159 ERROR oslo_service.service   File "/var/lib/openstack/lib/python3.8/site-packages/six.py", line 703, in reraise
2021-08-20 09:37:30.421 284159 ERROR oslo_service.service     raise value
2021-08-20 09:37:30.421 284159 ERROR oslo_service.service   File "/var/lib/openstack/lib/python3.8/site-packages/eventlet/tpool.py", line 83, in tworker
2021-08-20 09:37:30.421 284159 ERROR oslo_service.service     rv = meth(*args, **kwargs)
2021-08-20 09:37:30.421 284159 ERROR oslo_service.service   File "/var/lib/openstack/lib/python3.8/site-packages/libvirt.py", line 4614, in nodeDeviceLookupByName
2021-08-20 09:37:30.421 284159 ERROR oslo_service.service     if ret is None:raise libvirtError('virNodeDeviceLookupByName() failed', conn=self)
2021-08-20 09:37:30.421 284159 ERROR oslo_service.service libvirt.libvirtError: Node device not found: no node device with matching name 'mdev_74527849_d08c_4243_b868_f84a1437c9b5'
2021-08-20 09:37:30.421 284159 ERROR oslo_service.service
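
To triage a log like the one above, it can help to pull out the mediated device named in the libvirt error and turn it into a UUID. The following is a minimal sketch, not part of nova. The helper names `missing_mdev_names` and `mdev_name_to_uuid` are hypothetical, and the sketch assumes the device name has the form `mdev_<uuid with underscores>`, as it does in this traceback.

```python
import re
import uuid

# Matches the device name quoted in libvirt's "Node device not found" message.
MDEV_NAME_RE = re.compile(
    r"no node device with matching name '(mdev_[0-9a-fA-F_]+)'"
)


def missing_mdev_names(log_text):
    """Return the mdev names reported missing, de-duplicated, in first-seen order."""
    seen = []
    for name in MDEV_NAME_RE.findall(log_text):
        if name not in seen:
            seen.append(name)
    return seen


def mdev_name_to_uuid(name):
    """Convert 'mdev_<uuid with underscores>' to a canonical UUID string.

    Assumption: the name carries nothing but the prefix and the UUID.
    uuid.UUID raises ValueError if the remainder is not a valid UUID.
    """
    prefix = "mdev_"
    if not name.startswith(prefix):
        raise ValueError("not an mdev node device name: %r" % name)
    return str(uuid.UUID(name[len(prefix):].replace("_", "-")))
```

For the device in this report, `mdev_name_to_uuid("mdev_74527849_d08c_4243_b868_f84a1437c9b5")` yields `74527849-d08c-4243-b868-f84a1437c9b5`.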


Environment
============

nova: Victoria
OS: Ubuntu 20.04

Steps to Reproduce
===================


Create vGPU virtual machines (mdev), then reboot the host.
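
After the reboot, one way to confirm that the mediated devices are really gone from the host is to look for them in sysfs. This is a minimal sketch under stated assumptions: it relies on the kernel mdev framework exposing each device as an entry named by its UUID under `/sys/bus/mdev/devices`, and the helper names `mdev_present` and `missing_mdevs` are hypothetical. It does not query nova or libvirt.

```python
import os

# Directory where the kernel mdev framework lists existing mediated devices.
MDEV_SYSFS_ROOT = "/sys/bus/mdev/devices"


def mdev_present(mdev_uuid, sysfs_root=MDEV_SYSFS_ROOT):
    """Return True if an entry for this mdev UUID exists under sysfs_root."""
    return os.path.exists(os.path.join(sysfs_root, mdev_uuid))


def missing_mdevs(expected_uuids, sysfs_root=MDEV_SYSFS_ROOT):
    """Return the UUIDs from expected_uuids that have no sysfs entry."""
    return [u for u in expected_uuids if not mdev_present(u, sysfs_root)]
```

Calling `missing_mdevs(["74527849-d08c-4243-b868-f84a1437c9b5"])` on the rebooted host would return that UUID if the device was not recreated, which would be consistent with the libvirt error in the log.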

** Affects: nova
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1940641

Title:
  nova compute with allocated vgpu device failed to start after host
  reboot

Status in OpenStack Compute (nova):
  New


To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1940641/+subscriptions