yahoo-eng-team team mailing list archive
-
yahoo-eng-team team
-
Mailing list archive
-
Message #64364
[Bug 1693899] Re: Cleaning up stale resource allocations can fail with InstanceNotFound
Reviewed: https://review.openstack.org/468517
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=98b8e39ac5f7b3f2bb06ca415bbb806705461d74
Submitter: Jenkins
Branch: master
commit 98b8e39ac5f7b3f2bb06ca415bbb806705461d74
Author: Chris Dent <cdent@xxxxxxxxxxxxx>
Date: Fri May 26 19:12:46 2017 +0000
Catch InstanceNotFound when deleting allocations
If the list of allocations that are going to be deleted includes an
instance that can no longer be loaded by get_by_uuid (because
something has already removed it) we need to catch InstanceNotFound
and remove the allocations as planned. Without this change, the
exception is not caught, causing an error and leaving behind the
allocations we don't want to be there.
Change-Id: I204b077c287141a7f5643b2cc0065da2ba395c03
Closes-Bug: #1693899
** Changed in: nova
Status: In Progress => Fix Released
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1693899
Title:
Cleaning up stale resource allocations can fail with InstanceNotFound
Status in OpenStack Compute (nova):
Fix Released
Bug description:
Noticed what appears to be a race on delete of an instance where we're
deleting resource provider allocations records for a deleted instance
and when we go to get the instance from the database it's already
deleted:
http://logs.openstack.org/32/468232/1/gate/gate-tempest-dsvm-cells-
ubuntu-
xenial/ad969ba/logs/screen-n-cpu.txt.gz?level=TRACE#_May_26_17_29_52_292260
May 26 17:29:52.292260 ubuntu-xenial-ovh-bhs1-9010496 nova-compute[11974]: ERROR nova.compute.manager [req-c0f69b9e-c6eb-43e1-9bc9-372a42901987 None None] Error updating resources for node ubuntu-xenial-ovh-bhs1-9010496.
May 26 17:29:52.292537 ubuntu-xenial-ovh-bhs1-9010496 nova-compute[11974]: ERROR nova.compute.manager Traceback (most recent call last):
May 26 17:29:52.292730 ubuntu-xenial-ovh-bhs1-9010496 nova-compute[11974]: ERROR nova.compute.manager File "/opt/stack/new/nova/nova/compute/manager.py", line 6588, in update_available_resource_for_node
May 26 17:29:52.292908 ubuntu-xenial-ovh-bhs1-9010496 nova-compute[11974]: ERROR nova.compute.manager rt.update_available_resource(context, nodename)
May 26 17:29:52.293082 ubuntu-xenial-ovh-bhs1-9010496 nova-compute[11974]: ERROR nova.compute.manager File "/opt/stack/new/nova/nova/compute/resource_tracker.py", line 626, in update_available_resource
May 26 17:29:52.293266 ubuntu-xenial-ovh-bhs1-9010496 nova-compute[11974]: ERROR nova.compute.manager self._update_available_resource(context, resources)
May 26 17:29:52.293430 ubuntu-xenial-ovh-bhs1-9010496 nova-compute[11974]: ERROR nova.compute.manager File "/usr/local/lib/python2.7/dist-packages/oslo_concurrency/lockutils.py", line 271, in inner
May 26 17:29:52.293592 ubuntu-xenial-ovh-bhs1-9010496 nova-compute[11974]: ERROR nova.compute.manager return f(*args, **kwargs)
May 26 17:29:52.293750 ubuntu-xenial-ovh-bhs1-9010496 nova-compute[11974]: ERROR nova.compute.manager File "/opt/stack/new/nova/nova/compute/resource_tracker.py", line 667, in _update_available_resource
May 26 17:29:52.293909 ubuntu-xenial-ovh-bhs1-9010496 nova-compute[11974]: ERROR nova.compute.manager self._update_usage_from_instances(context, instances, nodename)
May 26 17:29:52.294070 ubuntu-xenial-ovh-bhs1-9010496 nova-compute[11974]: ERROR nova.compute.manager File "/opt/stack/new/nova/nova/compute/resource_tracker.py", line 1047, in _update_usage_from_instances
May 26 17:29:52.294235 ubuntu-xenial-ovh-bhs1-9010496 nova-compute[11974]: ERROR nova.compute.manager self._remove_deleted_instances_allocations(context, cn)
May 26 17:29:52.294453 ubuntu-xenial-ovh-bhs1-9010496 nova-compute[11974]: ERROR nova.compute.manager File "/opt/stack/new/nova/nova/compute/resource_tracker.py", line 1061, in _remove_deleted_instances_allocations
May 26 17:29:52.294628 ubuntu-xenial-ovh-bhs1-9010496 nova-compute[11974]: ERROR nova.compute.manager expected_attrs=[])
May 26 17:29:52.294797 ubuntu-xenial-ovh-bhs1-9010496 nova-compute[11974]: ERROR nova.compute.manager File "/usr/local/lib/python2.7/dist-packages/oslo_versionedobjects/base.py", line 177, in wrapper
May 26 17:29:52.294983 ubuntu-xenial-ovh-bhs1-9010496 nova-compute[11974]: ERROR nova.compute.manager args, kwargs)
May 26 17:29:52.295107 ubuntu-xenial-ovh-bhs1-9010496 nova-compute[11974]: ERROR nova.compute.manager File "/opt/stack/new/nova/nova/conductor/rpcapi.py", line 240, in object_class_action_versions
May 26 17:29:52.295205 ubuntu-xenial-ovh-bhs1-9010496 nova-compute[11974]: ERROR nova.compute.manager args=args, kwargs=kwargs)
May 26 17:29:52.295290 ubuntu-xenial-ovh-bhs1-9010496 nova-compute[11974]: ERROR nova.compute.manager File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/client.py", line 169, in call
May 26 17:29:52.295374 ubuntu-xenial-ovh-bhs1-9010496 nova-compute[11974]: ERROR nova.compute.manager retry=self.retry)
May 26 17:29:52.295457 ubuntu-xenial-ovh-bhs1-9010496 nova-compute[11974]: ERROR nova.compute.manager File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/transport.py", line 123, in _send
May 26 17:29:52.295547 ubuntu-xenial-ovh-bhs1-9010496 nova-compute[11974]: ERROR nova.compute.manager timeout=timeout, retry=retry)
May 26 17:29:52.295630 ubuntu-xenial-ovh-bhs1-9010496 nova-compute[11974]: ERROR nova.compute.manager File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 505, in send
May 26 17:29:52.295712 ubuntu-xenial-ovh-bhs1-9010496 nova-compute[11974]: ERROR nova.compute.manager retry=retry)
May 26 17:29:52.295795 ubuntu-xenial-ovh-bhs1-9010496 nova-compute[11974]: ERROR nova.compute.manager File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 496, in _send
May 26 17:29:52.295877 ubuntu-xenial-ovh-bhs1-9010496 nova-compute[11974]: ERROR nova.compute.manager raise result
May 26 17:29:52.295959 ubuntu-xenial-ovh-bhs1-9010496 nova-compute[11974]: ERROR nova.compute.manager InstanceNotFound_Remote: Instance 9ecdbc52-8f98-45aa-9814-ade07faa1026 could not be found.
We should handle the InstanceNotFound since in this path we are
cleaning up anyway.
Looks like this is triggered by something semi-recent:
http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22Error%20updating%20resources%20for%20node%5C%22%20AND%20message%3A%5C%22_remove_deleted_instances_allocations%5C%22%20AND%20message%3A%5C%22InstanceNotFound_Remote%5C%22%20AND%20tags%3A%5C%22screen-n-cpu.txt%5C%22&from=7d
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1693899/+subscriptions
References