yahoo-eng-team team mailing list archive
Message #13084
[Bug 1251792] Re: infinite recursion when deleting an instance with no network interfaces
** Changed in: nova/havana
Status: Fix Committed => Fix Released
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1251792
Title:
infinite recursion when deleting an instance with no network
interfaces
Status in OpenStack Compute (Nova):
Fix Released
Status in OpenStack Compute (nova) havana series:
Fix Released
Bug description:
In some situations when an instance has "no network information" (a
phrase that I'm using loosely), deleting the instance results in
infinite recursion. The stack looks like this:
2013-11-15 18:50:28.995 DEBUG nova.network.neutronv2.api [req-28f48294-0877-4f09-bcc1-7595dbd4c15a demo demo]
File "/usr/lib/python2.7/dist-packages/eventlet/greenpool.py", line 80, in _spawn_n_impl
func(*args, **kwargs)
File "/opt/stack/nova/nova/openstack/common/rpc/amqp.py", line 461, in _process_data
**args)
File "/opt/stack/nova/nova/openstack/common/rpc/dispatcher.py", line 172, in dispatch
result = getattr(proxyobj, method)(ctxt, **kwargs)
File "/opt/stack/nova/nova/compute/manager.py", line 354, in decorated_function
return function(self, context, *args, **kwargs)
File "/opt/stack/nova/nova/exception.py", line 73, in wrapped
return f(self, context, *args, **kw)
File "/opt/stack/nova/nova/compute/manager.py", line 230, in decorated_function
return function(self, context, *args, **kwargs)
File "/opt/stack/nova/nova/compute/manager.py", line 295, in decorated_function
function(self, context, *args, **kwargs)
File "/opt/stack/nova/nova/compute/manager.py", line 259, in decorated_function
return function(self, context, *args, **kwargs)
File "/opt/stack/nova/nova/compute/manager.py", line 1984, in terminate_instance
do_terminate_instance(instance, bdms)
File "/opt/stack/nova/nova/openstack/common/lockutils.py", line 248, in inner
return f(*args, **kwargs)
File "/opt/stack/nova/nova/compute/manager.py", line 1976, in do_terminate_instance
reservations=reservations)
File "/opt/stack/nova/nova/hooks.py", line 105, in inner
rv = f(*args, **kwargs)
File "/opt/stack/nova/nova/compute/manager.py", line 1919, in _delete_instance
self._shutdown_instance(context, db_inst, bdms)
File "/opt/stack/nova/nova/compute/manager.py", line 1829, in _shutdown_instance
network_info = self._get_instance_nw_info(context, instance)
File "/opt/stack/nova/nova/compute/manager.py", line 868, in _get_instance_nw_info
instance)
File "/opt/stack/nova/nova/network/neutronv2/api.py", line 449, in get_instance_nw_info
result = self._get_instance_nw_info(context, instance, networks)
File "/opt/stack/nova/nova/network/api.py", line 64, in wrapper
nw_info=res)
RECURSION STARTS HERE
File "/opt/stack/nova/nova/network/api.py", line 77, in update_instance_cache_with_nw_info
nw_info = api._get_instance_nw_info(context, instance)
File "/opt/stack/nova/nova/network/api.py", line 64, in wrapper
nw_info=res)
... REPEATS AD NAUSEAM ...
File "/opt/stack/nova/nova/network/api.py", line 77, in update_instance_cache_with_nw_info
nw_info = api._get_instance_nw_info(context, instance)
File "/opt/stack/nova/nova/network/api.py", line 64, in wrapper
nw_info=res)
File "/opt/stack/nova/nova/network/api.py", line 77, in update_instance_cache_with_nw_info
nw_info = api._get_instance_nw_info(context, instance)
File "/opt/stack/nova/nova/network/api.py", line 49, in wrapper
res = f(self, context, *args, **kwargs)
File "/opt/stack/nova/nova/network/neutronv2/api.py", line 459, in _get_instance_nw_info
LOG.debug('%s', ''.join(traceback.format_stack()))
Here's a step-by-step explanation of how the infinite recursion
arises:
1. somebody calls nova.network.neutronv2.api.API.get_instance_nw_info
2. in the above call, the network info is successfully retrieved as
result = self._get_instance_nw_info(context, instance, networks)
3. however, since the instance has "no network information", result is
the empty list (i.e., [])
4. the result is put in the cache by calling
nova.network.api.update_instance_cache_with_nw_info
5. update_instance_cache_with_nw_info is supposed to add the result to
the cache, but due to a bug in update_instance_cache_with_nw_info, it
recursively calls api._get_instance_nw_info, which brings us back to
step 2. The bug is the check before the recursive call:
    if not nw_info:
        nw_info = api._get_instance_nw_info(context, instance)
which erroneously equates [] and None. Hence the check should be "if
nw_info is None:"
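The steps above can be reproduced in miniature. The following is only a sketch under stated assumptions: FakeAPI, get_nw_info, update_cache and use_fix are hypothetical names standing in for the nova code paths, and it is written for Python 3 (where the overflow surfaces as RecursionError) rather than the Python 2.7 shown in the trace.

```python
class FakeAPI:
    """Hypothetical stand-in for the network API; not nova's real class."""

    def __init__(self, use_fix):
        self.use_fix = use_fix  # True: "is None" check, False: "not" check
        self.fetches = 0
        self.cache = None

    def _fetch(self):
        # The instance has no interfaces, so an empty list is a valid result.
        self.fetches += 1
        return []

    def get_nw_info(self):
        # Mirrors steps 2-4: retrieve the info, then push it into the cache.
        res = self._fetch()
        self.update_cache(nw_info=res)
        return res

    def update_cache(self, nw_info=None):
        # Mirrors step 5: only a missing argument should trigger a re-fetch.
        if self.use_fix:
            needs_fetch = nw_info is None
        else:
            needs_fetch = not nw_info  # also true for [], which is the bug
        if needs_fetch:
            nw_info = self.get_nw_info()
        self.cache = nw_info


def recurses_forever(api):
    """Return True if looking up network info overflows the stack."""
    try:
        api.get_nw_info()
    except RecursionError:
        return True
    return False
```

With use_fix=False the empty list is treated as "nothing was passed in", so every cache update triggers another lookup until the interpreter gives up. With use_fix=True the empty list is cached after a single fetch.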
I should clarify that the instance _did_ have network information at
some point (i.e., I booted it normally with a NIC). However, some time
after I issued a "nova delete" request, the network information was
gone (i.e., in nova list, the networks column was empty for the
instance while it was in the deleting task state).
I came across this problem when doing performance testing with the
latest openstack code (i.e., the master branches as of this morning of
all of the github.com/openstack/* projects).
There's an outstanding max recursion issue
(https://bugs.launchpad.net/nova/+bug/1251778) that could very well be
caused by this bug in update_instance_cache_with_nw_info. Note that in
the bug report you don't see repeated calls to
update_instance_cache_with_nw_info because LOG.exception only shows
the stack trace from the try: frame to the exceptional frame, whereas
I used traceback.format_stack() which prints everything.
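The difference between the two views of the stack can be seen with the standard traceback module alone. This is a small illustration, not nova code: caller, handler and failing are made-up function names, and traceback.format_exc() is used here as a stand-in for what an exception-logging call would record.

```python
import traceback


def failing():
    raise ValueError("boom")


def handler():
    try:
        failing()
    except ValueError:
        # Exception view: frames from this try block down to the raise.
        exc_view = traceback.format_exc()
        # Full view: every frame currently on the stack, callers included.
        stack_view = "".join(traceback.format_stack())
    return exc_view, stack_view


def caller():
    return handler()


exc_view, stack_view = caller()
```

Here exc_view mentions handler and failing but not caller, while stack_view includes caller. In a deep recursion, the repeated frames sit above the try block, which is why only the full stack shows them.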
Although the fix is simple enough, I'm not going to fire off a review
immediately because I haven't put much thought into how to test it.
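One possible shape for such a test, offered only as a sketch: update_cache below is a hypothetical, simplified function carrying the proposed "is None" check, not the real update_instance_cache_with_nw_info, and the api object is a unittest.mock.Mock rather than a real network API.

```python
from unittest import mock


def update_cache(api, nw_info=None):
    """Hypothetical simplified cache update using the proposed check."""
    if nw_info is None:
        nw_info = api._get_instance_nw_info()
    api.cache = nw_info
    return nw_info


def test_empty_list_is_cached_without_refetch():
    api = mock.Mock()
    update_cache(api, nw_info=[])
    # An empty list is a real result, so no second lookup should happen.
    api._get_instance_nw_info.assert_not_called()
    assert api.cache == []


def test_none_triggers_fetch():
    api = mock.Mock()
    api._get_instance_nw_info.return_value = ["vif"]
    result = update_cache(api, nw_info=None)
    # Only a missing argument should cause the lookup.
    api._get_instance_nw_info.assert_called_once_with()
    assert result == ["vif"]
```

The first test would fail against the "if not nw_info:" form, since the mock would record a call for the empty list.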