yahoo-eng-team team mailing list archive

[Bug 1251792] Re: infinite recursion when deleting an instance with no network interfaces

 

The stable/havana task was mistakenly marked as released: the merged patch 58471 only carried "Related-bug: 1251792".
https://review.openstack.org/57042 still needs to be backported to fix this in Havana.

** Changed in: nova/havana
    Milestone: 2013.2.1 => None

** Changed in: nova/havana
       Status: Fix Released => New

** Changed in: nova/havana
     Assignee: Armando Migliaccio (armando-migliaccio) => (unassigned)

** Changed in: nova/havana
       Status: New => Confirmed

** Changed in: nova/havana
     Assignee: (unassigned) => Aaron Rosen (arosen)

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1251792

Title:
  infinite recursion when deleting an instance with no network
  interfaces

Status in OpenStack Compute (Nova):
  Fix Released
Status in OpenStack Compute (nova) havana series:
  Confirmed

Bug description:
  In some situations when an instance has "no network information" (a
  phrase that I'm using loosely), deleting the instance results in
  infinite recursion. The stack looks like this:

  2013-11-15 18:50:28.995 DEBUG nova.network.neutronv2.api [req-28f48294-0877-4f09-bcc1-7595dbd4c15a demo demo]
    File "/usr/lib/python2.7/dist-packages/eventlet/greenpool.py", line 80, in _spawn_n_impl
      func(*args, **kwargs)
    File "/opt/stack/nova/nova/openstack/common/rpc/amqp.py", line 461, in _process_data
      **args)
    File "/opt/stack/nova/nova/openstack/common/rpc/dispatcher.py", line 172, in dispatch
      result = getattr(proxyobj, method)(ctxt, **kwargs)
    File "/opt/stack/nova/nova/compute/manager.py", line 354, in decorated_function
      return function(self, context, *args, **kwargs)
    File "/opt/stack/nova/nova/exception.py", line 73, in wrapped
      return f(self, context, *args, **kw)
    File "/opt/stack/nova/nova/compute/manager.py", line 230, in decorated_function
      return function(self, context, *args, **kwargs)
    File "/opt/stack/nova/nova/compute/manager.py", line 295, in decorated_function
      function(self, context, *args, **kwargs)
    File "/opt/stack/nova/nova/compute/manager.py", line 259, in decorated_function
      return function(self, context, *args, **kwargs)
    File "/opt/stack/nova/nova/compute/manager.py", line 1984, in terminate_instance
      do_terminate_instance(instance, bdms)
    File "/opt/stack/nova/nova/openstack/common/lockutils.py", line 248, in inner
      return f(*args, **kwargs)
    File "/opt/stack/nova/nova/compute/manager.py", line 1976, in do_terminate_instance
      reservations=reservations)
    File "/opt/stack/nova/nova/hooks.py", line 105, in inner
      rv = f(*args, **kwargs)
    File "/opt/stack/nova/nova/compute/manager.py", line 1919, in _delete_instance
      self._shutdown_instance(context, db_inst, bdms)
    File "/opt/stack/nova/nova/compute/manager.py", line 1829, in _shutdown_instance
      network_info = self._get_instance_nw_info(context, instance)
    File "/opt/stack/nova/nova/compute/manager.py", line 868, in _get_instance_nw_info
      instance)
    File "/opt/stack/nova/nova/network/neutronv2/api.py", line 449, in get_instance_nw_info
      result = self._get_instance_nw_info(context, instance, networks)
    File "/opt/stack/nova/nova/network/api.py", line 64, in wrapper
      nw_info=res)

  RECURSION STARTS HERE

    File "/opt/stack/nova/nova/network/api.py", line 77, in update_instance_cache_with_nw_info
      nw_info = api._get_instance_nw_info(context, instance)
    File "/opt/stack/nova/nova/network/api.py", line 64, in wrapper
      nw_info=res)

  ... REPEATS AD NAUSEAM ...

    File "/opt/stack/nova/nova/network/api.py", line 77, in update_instance_cache_with_nw_info
      nw_info = api._get_instance_nw_info(context, instance)
    File "/opt/stack/nova/nova/network/api.py", line 64, in wrapper
      nw_info=res)
    File "/opt/stack/nova/nova/network/api.py", line 77, in update_instance_cache_with_nw_info
      nw_info = api._get_instance_nw_info(context, instance)
    File "/opt/stack/nova/nova/network/api.py", line 49, in wrapper
      res = f(self, context, *args, **kwargs)
    File "/opt/stack/nova/nova/network/neutronv2/api.py", line 459, in _get_instance_nw_info
      LOG.debug('%s', ''.join(traceback.format_stack()))

  Here's a step-by-step explanation of how the infinite recursion
  arises:

  1. somebody calls nova.network.neutronv2.api.API.get_instance_nw_info

  2. in the above call, the network info is successfully retrieved as
  result = self._get_instance_nw_info(context, instance, networks)

  3. however, since the instance has "no network information", result is
  the empty list (i.e., [])

  4. the result is put in the cache by calling
  nova.network.api.update_instance_cache_with_nw_info

  5. update_instance_cache_with_nw_info is supposed to add the result to
  the cache, but due to a bug in update_instance_cache_with_nw_info, it
  recursively calls api._get_instance_nw_info, which brings us back to
  step 2. The bug is the check before the recursive call:

      if not nw_info:
          nw_info = api._get_instance_nw_info(context, instance)

  which erroneously equates [] and None. Hence the check should be "if
  nw_info is None:"
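
  The difference between the two checks can be reproduced outside Nova.
  The following is only a minimal sketch: FakeAPI, update_cache_buggy,
  update_cache_fixed and recursion_happens are hypothetical stand-ins
  invented for illustration, not the actual Nova code, and the sketch
  is written for Python 3 (where hitting the recursion limit raises
  RecursionError).

```python
class FakeAPI:
    """Hypothetical stand-in for the network API of an instance with no NICs."""

    def __init__(self, update_cache):
        self.calls = 0
        self._update_cache = update_cache

    def _get_instance_nw_info(self):
        self.calls += 1
        res = []  # the lookup succeeds, but there is no network information
        # Mimics the wrapper that refreshes the cache after every lookup.
        self._update_cache(self, nw_info=res)
        return res


def update_cache_buggy(api, nw_info=None):
    # "not nw_info" is true for None *and* for [], so an empty but valid
    # result triggers another lookup, which refreshes the cache again, ...
    if not nw_info:
        nw_info = api._get_instance_nw_info()
    return nw_info


def update_cache_fixed(api, nw_info=None):
    # Only look the information up when the caller supplied nothing at all.
    if nw_info is None:
        nw_info = api._get_instance_nw_info()
    return nw_info


def recursion_happens(update_cache):
    """Return True if a single lookup recurses until the interpreter gives up."""
    api = FakeAPI(update_cache)
    try:
        api._get_instance_nw_info()
    except RecursionError:
        return True
    return False
```

  With the "not nw_info" check, recursion_happens(update_cache_buggy)
  reports unbounded recursion; with the "is None" check the lookup
  runs once and returns the empty list.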

  I should clarify that the instance _did_ have network information at
  some point (i.e., I booted it normally with a NIC). However, some time
  after I issued a "nova delete" request, the network information was
  gone (i.e., in nova list, the networks column was empty for the
  instance while it was in the deleting task state).

  I came across this problem when doing performance testing with the
  latest OpenStack code (i.e., the master branches of all of the
  github.com/openstack/* projects as of this morning).

  There's an outstanding max recursion issue
  (https://bugs.launchpad.net/nova/+bug/1251778) that could very well be
  caused by this bug in update_instance_cache_with_nw_info. Note that in
  the bug report you don't see repeated calls to
  update_instance_cache_with_nw_info because LOG.exception only shows
  the stack trace from the try: frame to the exceptional frame, whereas
  I used traceback.format_stack() which prints everything.
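
  The difference between the two kinds of trace can be seen with the
  standard traceback module alone. This is a small sketch with made-up
  function names (outer, middle, inner); it uses
  traceback.format_exc() as a stand-in for the exception traceback that
  a logger's exception() method would record, which is an assumption
  about how the logging in the other report was produced.

```python
import traceback

captured = {}


def inner():
    # Full call stack at this point: every caller, including those
    # above the frame that contains the try block.
    captured["stack"] = "".join(traceback.format_stack())
    raise ValueError("boom")


def middle():
    try:
        inner()
    except ValueError:
        # Exception traceback: starts at the frame holding the try
        # block and ends at the frame that raised.
        captured["exc"] = traceback.format_exc()


def outer():
    middle()


outer()
```

  Here captured["stack"] contains a frame for outer, while
  captured["exc"] only covers middle and inner, so callers above the
  try: frame (such as repeated recursive calls) do not show up in it.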

  Although the fix is simple enough, I'm not going to fire off a review
  immediately because I haven't put much thought into how to test it.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1251792/+subscriptions