yahoo-eng-team team mailing list archive
Message #75224
[Bug 1797333] [NEW] Instances are locked and unable to start after server crash (queens)
Public bug reported:
After restarting a crashed host, the disks of its instances, which are
stored on NFS, are locked and the instances cannot be started:
libvirtError: internal error: process exited while connecting to monitor: 2018-10-10T10:16:09.816477Z qemu-system-x86_64: -drive file=/var/lib/nova/instances/ed7760a8-3008-4feb-83f3-3b753b0e7d6e/disk,format=qcow2,if=none,id=drive-virtio-disk0,cache=none: Failed to get "write" lock
ERROR nova.compute.manager [instance: ed7760a8-3008-4feb-83f3-3b753b0e7d6e] Is another process using the image?
The same error occurs on other compute nodes connected to the same
shared file system after evacuating the instances, so the disks appear
to be locked by libvirt in a way I could not find documented. As a
workaround, I had to make a copy of every failed instance's disk files,
delete the originals, and restore them from the copies. After that, the
instances started successfully.
If there is another way to unlock these instance disks, please share.
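For reference, the copy/delete/restore workaround can be sketched roughly
as below. This is only a minimal sketch under assumptions, not a tested
recovery procedure: the function name recreate_disk_file is hypothetical,
the instance is assumed to be stopped, and the idea that a fresh copy of
the file is what clears the lock is a guess based on the behaviour
described above. It uses os.replace instead of a separate delete and
rename so that the original is never missing if a step fails.

```python
import os
import shutil
import tempfile


def recreate_disk_file(disk_path):
    """Replace disk_path with a byte-identical copy stored as a new file.

    Hypothetical helper: assumes the instance that owns the disk is
    stopped and that nothing else is writing to the file.
    """
    directory = os.path.dirname(os.path.abspath(disk_path))
    original = os.stat(disk_path)

    # Create the copy next to the original so it stays on the same
    # (shared) file system and the final rename does not cross devices.
    fd, tmp_path = tempfile.mkstemp(prefix="disk.copy.", dir=directory)
    os.close(fd)
    try:
        # copy2 copies the data plus permission bits and timestamps.
        shutil.copy2(disk_path, tmp_path)
        try:
            # Keep the original owner; this usually needs root.
            os.chown(tmp_path, original.st_uid, original.st_gid)
        except (PermissionError, AttributeError):
            pass
        # Swap the copy into place; the old file is unlinked by the rename.
        os.replace(tmp_path, disk_path)
    except Exception:
        # On failure the original is still in place; drop the partial copy.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

It would be called once per affected disk, for example
recreate_disk_file("/var/lib/nova/instances/<instance uuid>/disk"), before
trying to start the instance again.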
** Affects: nova
Importance: Undecided
Status: New
** Tags: libvirt
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1797333