yahoo-eng-team team mailing list archive
-
yahoo-eng-team team
-
Mailing list archive
-
Message #58427
[Bug 1639914] [NEW] Race condition in nova compute during snapshot
Public bug reported:
On Liberty nova with Ceph storage, when snapshot is created and
immediately deleting the instance seems to cause race condition.
This can be created with following commands.
1. nova boot --flavor m1.large --image 6d4259ce-5873-42cb-8cbe-
9873f069c149 testinstance
id | bef22f9b-
ade4-48a1-86c4-b9a007897eb3
2. nova image-create bef22f9b-ade4-48a1-86c4-b9a007897eb3 testinstance-snap ; nova delete bef22f9b-ade4-48a1-86c4-b9a007897eb3
Request to delete server bef22f9b-ade4-48a1-86c4-b9a007897eb3 has been accepted.
3. nova image-list doesn't show the snapshot
4. nova list doesn't show the instance
Nova compute log indicates a race condition while executing CLI commands
in 2 above
<182>1 2016-10-28T14:46:41.830208+00:00 hyper1 nova-compute 30056 - [40521 levelname="INFO" component="nova-compute" funcname="nova.compute.manager" request_id="req-e9e4e899-e2a7-4bf8-bdf1-c26f5634cfda" user="51fa0172fbdf495e89132f7f4574e750" tenant="00ead348c5f9475f8940ab29cd767c5e" instance="[instance: bef22f9b-ade4-48a1-86c4-b9a007897eb3] " lineno="/usr/lib/python2.7/site-packages/nova/compute/manager.py:2249"] nova.compute.manager Terminating instance
<183>1 2016-10-28T14:46:42.057653+00:00 hyper1 nova-compute 30056 - [40521 levelname="DEBUG" component="nova-compute" funcname="nova.compute.manager" request_id="req-1c4cf749-a6a8-46af-b331-f70dc1e9f364" user="51fa0172fbdf495e89132f7f4574e750" tenant="00ead348c5f9475f8940ab29cd767c5e" instance="[instance: bef22f9b-ade4-48a1-86c4-b9a007897eb3] " lineno="/usr/lib/python2.7/site-packages/nova/compute/manager.py:420"] nova.compute.manager Cleaning up image ae9ebf4b-7dd6-4615-816f-c2f3c7c08530 decorated_function /usr/lib/python2.7/site-packages/nova/compute/manager.py:420
!!!NL!!! 30056 TRACE nova.compute.manager [instance: bef22f9b-ade4-48a1-86c4-b9a007897eb3] Traceback (most recent call last):
!!!NL!!! 30056 TRACE nova.compute.manager [instance: bef22f9b-ade4-48a1-86c4-b9a007897eb3] File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 416, in decorated_function
!!!NL!!! 30056 TRACE nova.compute.manager [instance: bef22f9b-ade4-48a1-86c4-b9a007897eb3] *args, **kwargs)
!!!NL!!! 30056 TRACE nova.compute.manager [instance: bef22f9b-ade4-48a1-86c4-b9a007897eb3] File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 3038, in snapshot_instance
!!!NL!!! 30056 TRACE nova.compute.manager [instance: bef22f9b-ade4-48a1-86c4-b9a007897eb3] task_states.IMAGE_SNAPSHOT)
!!!NL!!! 30056 TRACE nova.compute.manager [instance: bef22f9b-ade4-48a1-86c4-b9a007897eb3] File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 3068, in _snapshot_instance
!!!NL!!! 30056 TRACE nova.compute.manager [instance: bef22f9b-ade4-48a1-86c4-b9a007897eb3] update_task_state)
!!!NL!!! 30056 TRACE nova.compute.manager [instance: bef22f9b-ade4-48a1-86c4-b9a007897eb3] File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 1447, in snapshot
!!!NL!!! 30056 TRACE nova.compute.manager [instance: bef22f9b-ade4-48a1-86c4-b9a007897eb3] guest.save_memory_state()
!!!NL!!! 30056 TRACE nova.compute.manager [instance: bef22f9b-ade4-48a1-86c4-b9a007897eb3] File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/guest.py", line 363, in save_memory_state
!!!NL!!! 30056 TRACE nova.compute.manager [instance: bef22f9b-ade4-48a1-86c4-b9a007897eb3] self._domain.managedSave(0)
!!!NL!!! 30056 TRACE nova.compute.manager [instance: bef22f9b-ade4-48a1-86c4-b9a007897eb3] File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 183, in doit
!!!NL!!! 30056 TRACE nova.compute.manager [instance: bef22f9b-ade4-48a1-86c4-b9a007897eb3] result = proxy_call(self._autowrap, f, *args, **kwargs)
!!!NL!!! 30056 TRACE nova.compute.manager [instance: bef22f9b-ade4-48a1-86c4-b9a007897eb3] File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 141, in proxy_call
!!!NL!!! 30056 TRACE nova.compute.manager [instance: bef22f9b-ade4-48a1-86c4-b9a007897eb3] rv = execute(f, *args, **kwargs)
!!!NL!!! 30056 TRACE nova.compute.manager [instance: bef22f9b-ade4-48a1-86c4-b9a007897eb3] File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 122, in execute
!!!NL!!! 30056 TRACE nova.compute.manager [instance: bef22f9b-ade4-48a1-86c4-b9a007897eb3] six.reraise(c, e, tb)
!!!NL!!! 30056 TRACE nova.compute.manager [instance: bef22f9b-ade4-48a1-86c4-b9a007897eb3] File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 80, in tworker
!!!NL!!! 30056 TRACE nova.compute.manager [instance: bef22f9b-ade4-48a1-86c4-b9a007897eb3] rv = meth(*args, **kwargs)
!!!NL!!! 30056 TRACE nova.compute.manager [instance: bef22f9b-ade4-48a1-86c4-b9a007897eb3] File "/usr/lib64/python2.7/site-packages/libvirt.py", line 1397, in managedSave
!!!NL!!! 30056 TRACE nova.compute.manager [instance: bef22f9b-ade4-48a1-86c4-b9a007897eb3] if ret == -1: raise libvirtError ('virDomainManagedSave() failed', dom=self)
!!!NL!!! 30056 TRACE nova.compute.manager [instance: bef22f9b-ade4-48a1-86c4-b9a007897eb3] libvirtError: operation failed: domain is no longer running
Nova compute should make sure the save is completed before attempting to
delete the domain.
** Affects: nova
Importance: Undecided
Status: New
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1639914
Title:
Race condition in nova compute during snapshot
Status in OpenStack Compute (nova):
New
Bug description:
On Liberty nova with Ceph storage, when snapshot is created and
immediately deleting the instance seems to cause race condition.
This can be created with following commands.
1. nova boot --flavor m1.large --image 6d4259ce-5873-42cb-8cbe-
9873f069c149 testinstance
id | bef22f9b-
ade4-48a1-86c4-b9a007897eb3
2. nova image-create bef22f9b-ade4-48a1-86c4-b9a007897eb3 testinstance-snap ; nova delete bef22f9b-ade4-48a1-86c4-b9a007897eb3
Request to delete server bef22f9b-ade4-48a1-86c4-b9a007897eb3 has been accepted.
3. nova image-list doesn't show the snapshot
4. nova list doesn't show the instance
Nova compute log indicates a race condition while executing CLI
commands in 2 above
<182>1 2016-10-28T14:46:41.830208+00:00 hyper1 nova-compute 30056 - [40521 levelname="INFO" component="nova-compute" funcname="nova.compute.manager" request_id="req-e9e4e899-e2a7-4bf8-bdf1-c26f5634cfda" user="51fa0172fbdf495e89132f7f4574e750" tenant="00ead348c5f9475f8940ab29cd767c5e" instance="[instance: bef22f9b-ade4-48a1-86c4-b9a007897eb3] " lineno="/usr/lib/python2.7/site-packages/nova/compute/manager.py:2249"] nova.compute.manager Terminating instance
<183>1 2016-10-28T14:46:42.057653+00:00 hyper1 nova-compute 30056 - [40521 levelname="DEBUG" component="nova-compute" funcname="nova.compute.manager" request_id="req-1c4cf749-a6a8-46af-b331-f70dc1e9f364" user="51fa0172fbdf495e89132f7f4574e750" tenant="00ead348c5f9475f8940ab29cd767c5e" instance="[instance: bef22f9b-ade4-48a1-86c4-b9a007897eb3] " lineno="/usr/lib/python2.7/site-packages/nova/compute/manager.py:420"] nova.compute.manager Cleaning up image ae9ebf4b-7dd6-4615-816f-c2f3c7c08530 decorated_function /usr/lib/python2.7/site-packages/nova/compute/manager.py:420
!!!NL!!! 30056 TRACE nova.compute.manager [instance: bef22f9b-ade4-48a1-86c4-b9a007897eb3] Traceback (most recent call last):
!!!NL!!! 30056 TRACE nova.compute.manager [instance: bef22f9b-ade4-48a1-86c4-b9a007897eb3] File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 416, in decorated_function
!!!NL!!! 30056 TRACE nova.compute.manager [instance: bef22f9b-ade4-48a1-86c4-b9a007897eb3] *args, **kwargs)
!!!NL!!! 30056 TRACE nova.compute.manager [instance: bef22f9b-ade4-48a1-86c4-b9a007897eb3] File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 3038, in snapshot_instance
!!!NL!!! 30056 TRACE nova.compute.manager [instance: bef22f9b-ade4-48a1-86c4-b9a007897eb3] task_states.IMAGE_SNAPSHOT)
!!!NL!!! 30056 TRACE nova.compute.manager [instance: bef22f9b-ade4-48a1-86c4-b9a007897eb3] File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 3068, in _snapshot_instance
!!!NL!!! 30056 TRACE nova.compute.manager [instance: bef22f9b-ade4-48a1-86c4-b9a007897eb3] update_task_state)
!!!NL!!! 30056 TRACE nova.compute.manager [instance: bef22f9b-ade4-48a1-86c4-b9a007897eb3] File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 1447, in snapshot
!!!NL!!! 30056 TRACE nova.compute.manager [instance: bef22f9b-ade4-48a1-86c4-b9a007897eb3] guest.save_memory_state()
!!!NL!!! 30056 TRACE nova.compute.manager [instance: bef22f9b-ade4-48a1-86c4-b9a007897eb3] File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/guest.py", line 363, in save_memory_state
!!!NL!!! 30056 TRACE nova.compute.manager [instance: bef22f9b-ade4-48a1-86c4-b9a007897eb3] self._domain.managedSave(0)
!!!NL!!! 30056 TRACE nova.compute.manager [instance: bef22f9b-ade4-48a1-86c4-b9a007897eb3] File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 183, in doit
!!!NL!!! 30056 TRACE nova.compute.manager [instance: bef22f9b-ade4-48a1-86c4-b9a007897eb3] result = proxy_call(self._autowrap, f, *args, **kwargs)
!!!NL!!! 30056 TRACE nova.compute.manager [instance: bef22f9b-ade4-48a1-86c4-b9a007897eb3] File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 141, in proxy_call
!!!NL!!! 30056 TRACE nova.compute.manager [instance: bef22f9b-ade4-48a1-86c4-b9a007897eb3] rv = execute(f, *args, **kwargs)
!!!NL!!! 30056 TRACE nova.compute.manager [instance: bef22f9b-ade4-48a1-86c4-b9a007897eb3] File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 122, in execute
!!!NL!!! 30056 TRACE nova.compute.manager [instance: bef22f9b-ade4-48a1-86c4-b9a007897eb3] six.reraise(c, e, tb)
!!!NL!!! 30056 TRACE nova.compute.manager [instance: bef22f9b-ade4-48a1-86c4-b9a007897eb3] File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 80, in tworker
!!!NL!!! 30056 TRACE nova.compute.manager [instance: bef22f9b-ade4-48a1-86c4-b9a007897eb3] rv = meth(*args, **kwargs)
!!!NL!!! 30056 TRACE nova.compute.manager [instance: bef22f9b-ade4-48a1-86c4-b9a007897eb3] File "/usr/lib64/python2.7/site-packages/libvirt.py", line 1397, in managedSave
!!!NL!!! 30056 TRACE nova.compute.manager [instance: bef22f9b-ade4-48a1-86c4-b9a007897eb3] if ret == -1: raise libvirtError ('virDomainManagedSave() failed', dom=self)
!!!NL!!! 30056 TRACE nova.compute.manager [instance: bef22f9b-ade4-48a1-86c4-b9a007897eb3] libvirtError: operation failed: domain is no longer running
Nova compute should make sure the save is completed before attempting
to delete the domain.
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1639914/+subscriptions
Follow ups