yahoo-eng-team team mailing list archive
Message #84205
[Bug 1900800] [NEW] VGPUs is not recreated on host reboot
Public bug reported:
Description
===========
In Ussuri, when a compute node providing vGPUs (NVIDIA GRID in my case) is rebooted, the mdevs for the vGPUs are not recreated, and nova-compute throws a libvirt.libvirtError traceback.
https://paste.ubuntu.com/p/4t4NvTHGd8/
As far as I understand, this should have been fixed in
https://review.opendev.org/#/c/715489/ but it seems like it fails even
before it tries to recreate the mdev.
Expected result
===============
Upon host reboot, the mdevs should be recreated and the VMs should be restarted.
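For context, mediated devices do not survive a reboot, so something has to recreate them by writing the device UUID to the parent device's sysfs "create" node (the interface described in the kernel's VFIO mediated device documentation). The following is only a minimal sketch of that step, not nova's actual code. The function names, the PCI address and the mdev type in the usage note are hypothetical, and the sysfs root is a parameter so the logic can be tried against a scratch directory.

```python
import uuid
from pathlib import Path

SYSFS_ROOT = Path("/sys")  # assumption: standard sysfs mount point


def mdev_create_path(parent_pci, mdev_type, sysfs_root=SYSFS_ROOT):
    """Return the sysfs 'create' node for a parent device and mdev type."""
    return (Path(sysfs_root) / "class" / "mdev_bus" / parent_pci
            / "mdev_supported_types" / mdev_type / "create")


def existing_mdevs(sysfs_root=SYSFS_ROOT):
    """Return the UUIDs of the mdevs that currently exist on the host."""
    devices = Path(sysfs_root) / "bus" / "mdev" / "devices"
    if not devices.is_dir():
        return set()
    return {entry.name for entry in devices.iterdir()}


def recreate_mdev(mdev_uuid, parent_pci, mdev_type, sysfs_root=SYSFS_ROOT):
    """Create the mdev if it is missing; return True if it was created."""
    # uuid.UUID() rejects malformed input and normalizes to lowercase.
    mdev_uuid = str(uuid.UUID(mdev_uuid))
    if mdev_uuid in existing_mdevs(sysfs_root):
        return False
    # Writing the UUID to the 'create' node asks the kernel to create the mdev.
    mdev_create_path(parent_pci, mdev_type, sysfs_root).write_text(mdev_uuid)
    return True
```

On a real host this would be called as root with values taken from the instance's domain XML, for example recreate_mdev("<uuid from the XML>", "0000:41:00.0", "nvidia-35"), where both the PCI address and the type name are placeholders.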
Actual result
=============
nova-compute throws the aforementioned error, the mdevs are not recreated, and the VMs are left in an unrecoverable state.
Environment
===========
# dnf list installed | grep nova
openstack-nova-common.noarch 1:21.1.0-2.el8 @centos-openstack-ussuri
openstack-nova-compute.noarch 1:21.1.0-2.el8 @centos-openstack-ussuri
python3-nova.noarch 1:21.1.0-2.el8 @centos-openstack-ussuri
python3-novaclient.noarch 1:17.0.0-1.el8 @centos-openstack-ussuri
# dnf list installed | grep libvirt
libvirt-bash-completion.x86_64 6.0.0-25.2.el8 @advanced-virtualization
libvirt-client.x86_64 6.0.0-25.2.el8 @advanced-virtualization
libvirt-daemon.x86_64 6.0.0-25.2.el8 @advanced-virtualization
libvirt-daemon-config-nwfilter.x86_64 6.0.0-25.2.el8 @advanced-virtualization
libvirt-daemon-driver-interface.x86_64 6.0.0-25.2.el8 @advanced-virtualization
libvirt-daemon-driver-network.x86_64 6.0.0-25.2.el8 @advanced-virtualization
libvirt-daemon-driver-nodedev.x86_64 6.0.0-25.2.el8 @advanced-virtualization
libvirt-daemon-driver-nwfilter.x86_64 6.0.0-25.2.el8 @advanced-virtualization
libvirt-daemon-driver-qemu.x86_64 6.0.0-25.2.el8 @advanced-virtualization
libvirt-daemon-driver-secret.x86_64 6.0.0-25.2.el8 @advanced-virtualization
libvirt-daemon-driver-storage.x86_64 6.0.0-25.2.el8 @advanced-virtualization
libvirt-daemon-driver-storage-core.x86_64 6.0.0-25.2.el8 @advanced-virtualization
libvirt-daemon-driver-storage-disk.x86_64 6.0.0-25.2.el8 @advanced-virtualization
libvirt-daemon-driver-storage-gluster.x86_64 6.0.0-25.2.el8 @advanced-virtualization
libvirt-daemon-driver-storage-iscsi.x86_64 6.0.0-25.2.el8 @advanced-virtualization
libvirt-daemon-driver-storage-iscsi-direct.x86_64 6.0.0-25.2.el8 @advanced-virtualization
libvirt-daemon-driver-storage-logical.x86_64 6.0.0-25.2.el8 @advanced-virtualization
libvirt-daemon-driver-storage-mpath.x86_64 6.0.0-25.2.el8 @advanced-virtualization
libvirt-daemon-driver-storage-rbd.x86_64 6.0.0-25.2.el8 @advanced-virtualization
libvirt-daemon-driver-storage-scsi.x86_64 6.0.0-25.2.el8 @advanced-virtualization
libvirt-daemon-kvm.x86_64 6.0.0-25.2.el8 @advanced-virtualization
libvirt-libs.x86_64 6.0.0-25.2.el8 @advanced-virtualization
python3-libvirt.x86_64 6.0.0-1.el8 @advanced-virtualization
** Affects: nova
Importance: Undecided
Status: New
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1900800
Title:
VGPUs is not recreated on host reboot
Status in OpenStack Compute (nova):
New
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1900800/+subscriptions