yahoo-eng-team team mailing list archive
Message #46074
[Bug 1470437] Re: ImageCacheManager raises Permission denied error on nova compute in race condition
Reviewed: https://review.openstack.org/185549
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=ec9d5e375e208686d33b9259b039cc009bded42e
Submitter: Jenkins
Branch: master
commit ec9d5e375e208686d33b9259b039cc009bded42e
Author: Ankit Agrawal <ankit11.agrawal@xxxxxxxxxxx>
Date: Mon Aug 10 16:27:57 2015 +1000
libvirt: Race condition leads to instance in error
ImageCacheManager deletes the base image while the image backend is
copying it to the instance path, which puts the instance in the error state.
This change acquires a lock before removing an image from the cache. If
libvirt is copying the image to the instance path, the image cache manager
cannot remove it until libvirt has finished copying the image.
Closes-Bug: 1256838
Closes-Bug: 1470437
Co-Authored-By: Michael Still <mikal@xxxxxxxxxxx>
Depends-On: I337ce28e2fc516c91bec61ca3639ebff0029ad49
Change-Id: I376cc951922c338669fdf3f83da83e0d3cea1532
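The locking idea in the commit message above can be sketched as follows. This is a minimal illustration only, not the code from the nova change: the names lock_for, copy_to_instance, and remove_from_cache are hypothetical, and threading.Lock is used as a stand-in for whatever locking helper nova actually uses. A threading.Lock only coordinates threads inside one process, so a real deployment would need a lock that also works across processes.

```python
import os
import shutil
import threading

# Hypothetical registry holding one lock per base file path.
_locks = {}
_locks_guard = threading.Lock()


def lock_for(base_file):
    """Return the lock shared by every caller touching this base file."""
    key = os.path.abspath(base_file)
    with _locks_guard:
        return _locks.setdefault(key, threading.Lock())


def copy_to_instance(base_file, instance_path):
    """Copy the cached base image while holding its lock."""
    with lock_for(base_file):
        shutil.copyfile(base_file, instance_path)


def remove_from_cache(base_file):
    """Remove the cached base image only when no copy is in progress."""
    with lock_for(base_file):
        if os.path.exists(base_file):
            os.remove(base_file)
```

Because both functions take the same per-file lock, a removal that starts during a copy blocks until the copy has finished, which is the ordering the commit message describes.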
** Changed in: nova
Status: In Progress => Fix Released
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1470437
Title:
ImageCacheManager raises Permission denied error on nova compute in
race condition
Status in OpenStack Compute (nova):
Fix Released
Bug description:
ImageCacheManager raises Permission denied error on nova compute in
race condition
While creating an instance snapshot, nova calls the guest.launch method
of the libvirt driver, which changes the base file permissions and
changes the base file owner from openstack to libvirt-qemu (with the
qcow2 image backend). A race occurs when ImageCacheManager is about to
update the last access time of this base file and the instance snapshot
calls guest.launch just before the update: ImageCacheManager then
raises a Permission denied error in nova compute from os.utime().
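The failing call is os.utime(base_file, None), which sets the file's access and modification times to the current time and fails with errno 13 when the calling user lacks the permissions to do so. The sketch below is not nova code; touch_base_file is a hypothetical helper that only shows how a caller could tell this specific failure apart from other OSError cases.

```python
import errno
import os


def touch_base_file(base_file):
    """Update the file's timestamps; return False on a permission error.

    Hypothetical helper for illustration, not part of nova.
    """
    try:
        # None means "set access and modification time to now".
        os.utime(base_file, None)
    except OSError as exc:
        # Errno 13 (EACCES) is the error shown in the bug report.
        if exc.errno in (errno.EACCES, errno.EPERM):
            return False
        # Anything else (for example a missing file) is a different problem.
        raise
    return True
```

Returning False here only makes the condition visible to the caller; it does not address the ownership change that causes the error.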
Steps to reproduce:
1. Configure image_cache_manager_interval=120 in nova.conf and use the qcow2 image backend.
2. Add a 60-second sleep to the _handle_base_image method of libvirt.imagecache, just before the call to os.utime().
3. Restart nova services.
4. Create an instance from an image.
$ nova boot --image 5e1659aa-6d38-44e8-aaa3-4217337436c0 --flavor 1 instance-1
5. Check that the instance is in the active state.
6. Go to the n-cpu screen and check the imagecache manager logs for the point at which it waits on the sleep statement added in step 2.
7. Send an instance snapshot request while the imagecache manager is waiting in the sleep.
$ nova image-create 19c7900b-73d5-4c2e-b129-5e2a6b13f396 instance-1-snap
8. The instance snapshot request changes the base file owner to libvirt-qemu by calling the guest.launch method of the libvirt driver.
9. When the imagecache manager wakes from the sleep and executes os.utime, it raises the following Permission denied error in nova compute.
2015-07-01 01:51:46.794 ERROR nova.openstack.common.periodic_task [req-a03fa45f-ffb9-48dd-8937-5b0414c6864b None None] Error during ComputeManager._run_image_cache_manager_pass
2015-07-01 01:51:46.794 TRACE nova.openstack.common.periodic_task Traceback (most recent call last):
2015-07-01 01:51:46.794 TRACE nova.openstack.common.periodic_task File "/opt/stack/nova/nova/openstack/common/periodic_task.py", line 224, in run_periodic_tasks
2015-07-01 01:51:46.794 TRACE nova.openstack.common.periodic_task task(self, context)
2015-07-01 01:51:46.794 TRACE nova.openstack.common.periodic_task File "/opt/stack/nova/nova/compute/manager.py", line 6177, in _run_image_cache_manager_pass
2015-07-01 01:51:46.794 TRACE nova.openstack.common.periodic_task self.driver.manage_image_cache(context, filtered_instances)
2015-07-01 01:51:46.794 TRACE nova.openstack.common.periodic_task File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 6252, in manage_image_cache
2015-07-01 01:51:46.794 TRACE nova.openstack.common.periodic_task self.image_cache_manager.update(context, all_instances)
2015-07-01 01:51:46.794 TRACE nova.openstack.common.periodic_task File "/opt/stack/nova/nova/virt/libvirt/imagecache.py", line 668, in update
2015-07-01 01:51:46.794 TRACE nova.openstack.common.periodic_task self._age_and_verify_cached_images(context, all_instances, base_dir)
2015-07-01 01:51:46.794 TRACE nova.openstack.common.periodic_task File "/opt/stack/nova/nova/virt/libvirt/imagecache.py", line 598, in _age_and_verify_cached_images
2015-07-01 01:51:46.794 TRACE nova.openstack.common.periodic_task self._handle_base_image(img, base_file)
2015-07-01 01:51:46.794 TRACE nova.openstack.common.periodic_task File "/opt/stack/nova/nova/virt/libvirt/imagecache.py", line 570, in _handle_base_image
2015-07-01 01:51:46.794 TRACE nova.openstack.common.periodic_task os.utime(base_file, None)
2015-07-01 01:51:46.794 TRACE nova.openstack.common.periodic_task OSError: [Errno 13] Permission denied: '/opt/stack/data/nova/instances/_base/8d2c340dcce68e48a75457b1e91457feed27aef5'
2015-07-01 01:51:46.794 TRACE nova.openstack.common.periodic_task
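Step 2 of the reproduction widens the race window by pausing just before os.utime(). A rough sketch of that kind of temporary change is shown here; handle_base_image_with_delay is a hypothetical stand-in and not the real _handle_base_image method, whose body is not shown in this report.

```python
import os
import time


def handle_base_image_with_delay(base_file, delay_seconds=60):
    """Hypothetical stand-in for the patched method used in step 2."""
    # Artificial pause: gives a concurrent snapshot request time to
    # change the base file owner before the timestamp update runs.
    time.sleep(delay_seconds)
    # The call that fails with Errno 13 in the traceback above.
    os.utime(base_file, None)
```

The delay is a debugging aid for reproducing the race and would be removed again afterwards.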
Expected result: guest.launch should not update the base file
permissions and owner to libvirt-qemu. Base file owner should remain
unchanged.
Actual result: libvirt changes the base file owner, which causes
permission errors in nova.
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1470437/+subscriptions