Message #88234
[Bug 1960230] Re: resize fails with FileExistsError if earlier resize attempt failed to cleanup
Reviewed: https://review.opendev.org/c/openstack/nova/+/827865
Committed: https://opendev.org/openstack/nova/commit/9111b99f739d41c092db8d01712a5aa72388b5fb
Submitter: "Zuul (22348)"
Branch: master
commit 9111b99f739d41c092db8d01712a5aa72388b5fb
Author: Tobias Urdin <tobias.urdin@xxxxxxxxxx>
Date: Fri Feb 4 15:01:36 2022 +0100
Cleanup old resize instances dir before resize
If there is a failed resize that also failed the cleanup
process performed by _cleanup_remote_migration() the retry
of the resize will fail because it cannot rename the current
instances directory to _resize.
This renames the _cleanup_failed_migration() that does the
same logic we want to _cleanup_failed_instance_base() and
uses it for both migration and resize cleanup of directory.
It then simply calls _cleanup_failed_instance_base() with
the resize dir path before trying a resize.
Closes-Bug: 1960230
Change-Id: I7412b16be310632da59a6139df9f0913281b5d77
** Changed in: nova
Status: In Progress => Fix Released
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1960230
Title:
resize fails with FileExistsError if earlier resize attempt failed to
cleanup
Status in OpenStack Compute (nova):
Fix Released
Bug description:
This bug is related to resize with the libvirt driver.
If you perform a resize and it fails, the
_cleanup_remote_migration() [1] function in the libvirt driver will
try to clean up the /var/lib/nova/instances/<uuid>_resize directory on
the remote side [2]. If that cleanup also fails, the <uuid>_resize
directory is left behind and blocks any future resize attempts.
2021-12-14 14:40:12.535 175177 INFO nova.virt.libvirt.driver
[req-9d3477d4-3bb2-456f-9be6-dce9893b0e95
23d6aa8884ab44ef9f214ad195d273c0 050c556faa5944a8953126c867313770 -
default default] [instance: 99287438-c37b-44b0-834e-55685b6e83eb]
Deletion of
/var/lib/nova/instances/99287438-c37b-44b0-834e-55685b6e83eb_resize
failed
Then, on the next resize attempt a long time later:
2022-02-04 13:07:31.255 175177 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.6/site-packages/nova/virt/libvirt/driver.py", line 10429, in migrate_disk_and_power_off
2022-02-04 13:07:31.255 175177 ERROR oslo_messaging.rpc.server os.rename(inst_base, inst_base_resize)
2022-02-04 13:07:31.255 175177 ERROR oslo_messaging.rpc.server FileExistsError: [Errno 17] File exists: '/var/lib/nova/instances/99287438-c37b-44b0-834e-55685b6e83eb' -> '/var/lib/nova/instances/99287438-c37b-44b0-834e-55685b6e83eb_resize'
This happens here [3] because os.rename tries to rename the
/var/lib/nova/instances/<uuid> directory to <uuid>_resize, which
already exists, and fails with FileExistsError.
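The failure can be illustrated outside of nova with a minimal sketch. It is
not nova code: the function name and directory names are hypothetical, and
it assumes the leftover target directory is non-empty. Depending on the
platform and filesystem, the error number may be EEXIST (as in the log
above) or ENOTEMPTY, so the sketch only reports which one was raised.

```python
import errno
import os
import tempfile


def try_rename_onto_stale_dir():
    """Return the errno raised when renaming onto a non-empty directory.

    Hypothetical helper for illustration only; returns None if the
    rename unexpectedly succeeds.
    """
    with tempfile.TemporaryDirectory() as root:
        inst_base = os.path.join(root, "uuid")      # stands in for <uuid>
        inst_base_resize = inst_base + "_resize"    # stale leftover dir
        os.makedirs(inst_base)
        os.makedirs(inst_base_resize)
        # Make the leftover directory non-empty, like a failed cleanup would.
        with open(os.path.join(inst_base_resize, "disk"), "w") as f:
            f.write("stale")
        try:
            os.rename(inst_base, inst_base_resize)
        except OSError as exc:  # FileExistsError is a subclass of OSError
            return exc.errno
        return None


if __name__ == "__main__":
    err = try_rename_onto_stale_dir()
    print(errno.errorcode.get(err, err))
```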
We should check whether the <uuid>_resize directory exists before
trying to rename, and delete it first if it does.
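A rough sketch of that idea follows. It is not the nova implementation:
remove_stale_dir() and move_to_resize_dir() are hypothetical names, and the
sketch assumes that discarding whatever is in the leftover <uuid>_resize
directory is acceptable. Rather than checking for existence first, it
removes the directory and tolerates the case where it is already gone.

```python
import errno
import os
import shutil


def remove_stale_dir(path):
    """Remove path recursively, ignoring the case where it does not exist."""
    try:
        shutil.rmtree(path)
    except OSError as exc:
        # A missing directory is fine; anything else is a real error.
        if exc.errno != errno.ENOENT:
            raise


def move_to_resize_dir(inst_base):
    """Rename inst_base to inst_base + '_resize', clearing leftovers first."""
    inst_base_resize = inst_base + "_resize"
    remove_stale_dir(inst_base_resize)
    os.rename(inst_base, inst_base_resize)
    return inst_base_resize
```

With this ordering, a retry after a failed cleanup no longer depends on the
earlier attempt having removed its own <uuid>_resize directory.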
[1] https://opendev.org/openstack/nova/src/branch/stable/xena/nova/virt/libvirt/driver.py#L10773
[2] https://opendev.org/openstack/nova/src/branch/stable/xena/nova/virt/libvirt/driver.py#L10965
[3] https://opendev.org/openstack/nova/src/branch/stable/xena/nova/virt/libvirt/driver.py#L10915
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1960230/+subscriptions