← Back to team overview

yahoo-eng-team team mailing list archive

[Bug 1304968] Re: Nova cpu full of instance_info_cache stack traces due to attempting to send events about deleted instances

 

** Also affects: nova/icehouse
   Importance: Undecided
       Status: New

** Changed in: nova/icehouse
    Milestone: None => 2014.1.3

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1304968

Title:
  Nova cpu full of instance_info_cache stack traces due to attempting to
  send events about deleted instances

Status in OpenStack Compute (Nova):
  Fix Released
Status in OpenStack Compute (nova) icehouse series:
  New

Bug description:
  The bulk of the stack traces in n-cpu is because emit_event is getting
  triggered on a VM delete, however by the time we get to emit_event the
  instance is deleted (we see this exception 183 times in this log -
  which means it's happening on *every* compute terminate) so when we
  try to look up the instance we hit the exception found here:

      @base.remotable_classmethod
      def get_by_instance_uuid(cls, context, instance_uuid):
          db_obj = db.instance_info_cache_get(context, instance_uuid)
          if not db_obj:
              raise exception.InstanceInfoCacheNotFound(
                      instance_uuid=instance_uuid)
          return InstanceInfoCache._from_db_object(context, cls(), db_obj)

  A log trace of this interaction looks like this:

  
  2014-04-08 11:14:25.475 DEBUG nova.openstack.common.lockutils [req-fe9db989-416e-4da0-986c-e68336e3c602 TenantUsagesTestJSON-153098759 TenantUsagesTestJSON-953946497] Semaphore / lock released "do_terminate_instance" inner /opt/stack/new/nova/nova/openstack/common/lockutils.py:252
  2014-04-08 11:14:25.907 DEBUG nova.openstack.common.lockutils [req-687de0cf-67fc-434f-927f-4c37665ad5d8 FixedIPsTestJson-234831436 FixedIPsTestJson-1960919997] Got semaphore "75da98d7-bbd5-42a2-ad6f-7a66e38977fa" lock /opt/stack/new/nova/nova/openstack/common/lockutils.py:168
  2014-04-08 11:14:25.907 DEBUG nova.openstack.common.lockutils [req-687de0cf-67fc-434f-927f-4c37665ad5d8 FixedIPsTestJson-234831436 FixedIPsTestJson-1960919997] Got semaphore / lock "do_terminate_instance" inner /opt/stack/new/nova/nova/openstack/common/lockutils.py:248
  2014-04-08 11:14:25.907 DEBUG nova.openstack.common.lockutils [req-687de0cf-67fc-434f-927f-4c37665ad5d8 FixedIPsTestJson-234831436 FixedIPsTestJson-1960919997] Got semaphore "<function _lock_name at 0x41635f0>" lock /opt/stack/new/nova/nova/openstack/common/lockutils.py:168
  2014-04-08 11:14:25.908 DEBUG nova.openstack.common.lockutils [req-687de0cf-67fc-434f-927f-4c37665ad5d8 FixedIPsTestJson-234831436 FixedIPsTestJson-1960919997] Got semaphore / lock "_clear_events" inner /opt/stack/new/nova/nova/openstack/common/lockutils.py:248
  2014-04-08 11:14:25.908 DEBUG nova.openstack.common.lockutils [req-687de0cf-67fc-434f-927f-4c37665ad5d8 FixedIPsTestJson-234831436 FixedIPsTestJson-1960919997] Semaphore / lock released "_clear_events" inner /opt/stack/new/nova/nova/openstack/common/lockutils.py:252
  2014-04-08 11:14:25.928 AUDIT nova.compute.manager [req-687de0cf-67fc-434f-927f-4c37665ad5d8 FixedIPsTestJson-234831436 FixedIPsTestJson-1960919997] [instance: 75da98d7-bbd5-42a2-ad6f-7a66e38977fa] Terminating instance
  2014-04-08 11:14:25.989 DEBUG nova.objects.instance [req-687de0cf-67fc-434f-927f-4c37665ad5d8 FixedIPsTestJson-234831436 FixedIPsTestJson-1960919997] Lazy-loading `system_metadata' on Instance uuid 75da98d7-bbd5-42a2-ad6f-7a66e38977fa obj_load_attr /opt/stack/new/nova/nova/objects/instance.py:519
  2014-04-08 11:14:26.209 DEBUG nova.network.api [req-687de0cf-67fc-434f-927f-4c37665ad5d8 FixedIPsTestJson-234831436 FixedIPsTestJson-1960919997] Updating cache with info: [VIF({'ovs_interfaceid': None, 'network': Network({'bridge': u'br100', 'subnets': [Subnet({'ips': [FixedIP({'meta': {}, 'version': 4, 'type': u'fixed', 'floating_ips': [], 'address': u'10.1.0.2'})], 'version': 4, 'meta': {u'dhcp_server': u'10.1.0.1'}, 'dns': [IP({'meta': {}, 'version': 4, 'type': u'dns', 'address': u'8.8.4.4'})], 'routes': [], 'cidr': u'10.1.0.0/24', 'gateway': IP({'meta': {}, 'version': 4, 'type': u'gateway', 'address': u'10.1.0.1'})}), Subnet({'ips': [], 'version': None, 'meta': {u'dhcp_server': None}, 'dns': [], 'routes': [], 'cidr': None, 'gateway': IP({'meta': {}, 'version': None, 'type': u'gateway', 'address': None})})], 'meta': {u'tenant_id': None, u'should_create_bridge': True, u'bridge_interface': u'eth0'}, 'id': u'9751787e-f41c-4299-be13-941c901f6d18', 'label': u'private'}), 'devname': None, 'qbh_params': None, 'meta': {}, 'details': {}, 'address': u'fa:16:3e:d8:87:38', 'active': False, 'type': u'bridge', 'id': u'db1ac48d-805a-45d3-9bb9-786bb5855673', 'qbg_params': None})] update_instance_cache_with_nw_info /opt/stack/new/nova/nova/network/api.py:74
  2014-04-08 11:14:27.661 2894 DEBUG nova.virt.driver [-] Emitting event <nova.virt.event.LifecycleEvent object at 0x4932e50> emit_event /opt/stack/new/nova/nova/virt/driver.py:1207
  2014-04-08 11:14:27.661 2894 INFO nova.compute.manager [-] Lifecycle event 1 on VM 75da98d7-bbd5-42a2-ad6f-7a66e38977fa
  2014-04-08 11:14:27.773 2894 ERROR nova.virt.driver [-] Exception dispatching event <nova.virt.event.LifecycleEvent object at 0x4932e50>: Info cache for instance 75da98d7-bbd5-42a2-ad6f-7a66e38977fa could not be found.
  Traceback (most recent call last):

    File "/opt/stack/new/nova/nova/conductor/manager.py", line 597, in _object_dispatch
      return getattr(target, method)(context, *args, **kwargs)

    File "/opt/stack/new/nova/nova/objects/base.py", line 151, in wrapper
      return fn(self, ctxt, *args, **kwargs)

    File "/opt/stack/new/nova/nova/objects/instance.py", line 500, in refresh
      self.info_cache.refresh()

    File "/opt/stack/new/nova/nova/objects/base.py", line 151, in wrapper
      return fn(self, ctxt, *args, **kwargs)

    File "/opt/stack/new/nova/nova/objects/instance_info_cache.py", line 103, in refresh
      self.instance_uuid)

    File "/opt/stack/new/nova/nova/objects/base.py", line 112, in wrapper
      result = fn(cls, context, *args, **kwargs)

    File "/opt/stack/new/nova/nova/objects/instance_info_cache.py", line 70, in get_by_instance_uuid
      instance_uuid=instance_uuid)

  InstanceInfoCacheNotFound: Info cache for instance 75da98d7-bbd5-42a2
  -ad6f-7a66e38977fa could not be found.

  2014-04-08 11:14:27.840 2894 INFO nova.virt.libvirt.driver [-]
  [instance: 75da98d7-bbd5-42a2-ad6f-7a66e38977fa] Instance destroyed
  successfully.

  Raw logs for a failure: http://logs.openstack.org/38/62038/14/check
  /check-tempest-dsvm-full/86cde16/logs/screen-n-cpu.txt.gz?level=TRACE

  Specific failure point: http://logs.openstack.org/38/62038/14/check
  /check-tempest-dsvm-
  full/86cde16/logs/screen-n-cpu.txt.gz?level=DEBUG#_2014-04-08_11_14_25_928

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1304968/+subscriptions


References