yahoo-eng-team team mailing list archive

[Bug 1789998] Re: ResourceProviderAllocationRetrievalFailed ERROR log message on fresh n-cpu startup

 

Reviewed:  https://review.openstack.org/609552
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=418fc93a10fe18de27c75b522a6afdc15e1c49f2
Submitter: Zuul
Branch:    master

commit 418fc93a10fe18de27c75b522a6afdc15e1c49f2
Author: Matt Riedemann <mriedem.os@xxxxxxxxx>
Date:   Wed Oct 10 17:37:38 2018 -0400

    Skip _remove_deleted_instances_allocations if compute is new
    
    If this is the first start of the compute service and the compute node
    record does not exist, the resource provider won't exist either. So when
    the ResourceTracker._remove_deleted_instances_allocations method is called
    it's going to log an ERROR because get_allocations_for_resource_provider
    will raise an error since the resource provider doesn't yet exist (that
    happens later during RT._update() on the new compute node record).
    
    We can avoid calling _remove_deleted_instances_allocations if we know the
    compute node is newly created, so this adds handling for that case.
    
    Tests are updated and an unnecessary mock is removed along the way.
    
    Change-Id: I37e8ad5b14262d801702411c2c87e73550adda70
    Closes-Bug: #1789998
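
The control flow described in the commit message can be sketched roughly as
follows. This is a simplified illustration, not the actual nova code:
TrackerSketch, FakePlacement, ResourceProviderNotFound and the is_new return
value are hypothetical stand-ins; only the method names
_remove_deleted_instances_allocations, _update and
get_allocations_for_resource_provider come from the message above.

```python
class ResourceProviderNotFound(Exception):
    """Hypothetical stand-in for the 404 failure described in the bug."""


class FakePlacement:
    """Hypothetical in-memory substitute for the placement service."""

    def __init__(self):
        self.providers = {}  # provider uuid -> allocations dict

    def get_allocations_for_resource_provider(self, rp_uuid):
        if rp_uuid not in self.providers:
            raise ResourceProviderNotFound(rp_uuid)
        return self.providers[rp_uuid]


class TrackerSketch:
    """Toy resource tracker; assumption-laden, for illustration only."""

    def __init__(self, placement):
        self.placement = placement
        self.compute_nodes = {}
        self.errors = []  # collects what would be ERROR log lines

    def _init_compute_node(self, uuid):
        # Returns True only when the compute node record is created here.
        if uuid in self.compute_nodes:
            return False
        self.compute_nodes[uuid] = {"uuid": uuid}
        return True

    def _remove_deleted_instances_allocations(self, uuid):
        try:
            self.placement.get_allocations_for_resource_provider(uuid)
        except ResourceProviderNotFound as exc:
            self.errors.append("Skipping removal of allocations: %s" % exc)

    def _update(self, uuid):
        # The resource provider is only created at this later step.
        self.placement.providers.setdefault(uuid, {})

    def update_available_resource(self, uuid):
        is_new = self._init_compute_node(uuid)
        if not is_new:
            # Skipped on first start: the provider cannot exist yet.
            self._remove_deleted_instances_allocations(uuid)
        self._update(uuid)
        return is_new
```

In this sketch the first call for a node creates the record and skips the
cleanup, so nothing is recorded as an error; later calls run the cleanup
against a provider that now exists.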


** Changed in: nova
       Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1789998

Title:
  ResourceProviderAllocationRetrievalFailed ERROR log message on fresh
  n-cpu startup

Status in OpenStack Compute (nova):
  Fix Released

Bug description:
  As a result of this recent change in Stein:

  https://review.openstack.org/#/c/584598/21/nova/compute/resource_tracker.py@1281

  We now get this error in the n-cpu logs on a fresh startup after the
  compute node record is created in the database but before the resource
  provider is created in placement:

  http://logs.openstack.org/98/584598/21/check/tempest-full/85acbda/controller/logs/screen-n-cpu.txt.gz?level=TRACE#_Aug_29_21_43_10_675029

  Aug 29 21:43:10.675029 ubuntu-xenial-rax-iad-0001643010
  nova-compute[16853]: ERROR nova.compute.resource_tracker
  [None req-5ee3cf40-9136-42b6-b370-89f6b17ac61a None None] Skipping
  removal of allocations for deleted instances: Failed to retrieve
  allocations for resource provider
  6b03ae3f-495d-472a-804b-6cac034f5661: {"errors": [{"status": 404,
  "request_id": "req-6ff222c2-be32-471a-8764-d7168e6de73f", "detail":
  "The resource could not be found.\n\n Resource provider
  '6b03ae3f-495d-472a-804b-6cac034f5661' not found: No resource provider
  with uuid 6b03ae3f-495d-472a-804b-6cac034f5661 found  ", "title":
  "Not Found"}]}: ResourceProviderAllocationRetrievalFailed: Failed to
  retrieve allocations for resource provider
  6b03ae3f-495d-472a-804b-6cac034f5661: {"errors": [{"status": 404,
  "request_id": "req-6ff222c2-be32-471a-8764-d7168e6de73f", "detail":
  "The resource could not be found.\n\n Resource provider
  '6b03ae3f-495d-472a-804b-6cac034f5661' not found: No resource provider
  with uuid 6b03ae3f-495d-472a-804b-6cac034f5661 found  ", "title":
  "Not Found"}]}

  We could probably pass a flag down to indicate whether the compute node
  is newly created and, if it is and we hit that exception, just ignore it.
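
  The flag-passing idea suggested above might look something like the
  sketch below. It is only an illustration under stated assumptions: the
  function name, the is_new_compute_node parameter, the get_allocations
  callable and the AllocationRetrievalFailed class are all hypothetical
  and are not taken from nova.

```python
import logging

LOG = logging.getLogger(__name__)


class AllocationRetrievalFailed(Exception):
    """Hypothetical stand-in for the exception named in the log above."""


def remove_deleted_instances_allocations(get_allocations, rp_uuid,
                                         is_new_compute_node=False):
    """Fetch allocations, tolerating a missing provider on first start.

    get_allocations is any callable taking a provider uuid; it is an
    assumption of this sketch, not a real client API.
    """
    try:
        return get_allocations(rp_uuid)
    except AllocationRetrievalFailed as exc:
        if is_new_compute_node:
            # Expected on a fresh start: the provider is created later.
            LOG.debug("No resource provider %s yet: %s", rp_uuid, exc)
        else:
            LOG.error("Skipping removal of allocations for deleted "
                      "instances: %s", exc)
        return None
```

  With this shape the failure is still caught in both cases; the flag
  only decides whether it is logged at DEBUG or ERROR level.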

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1789998/+subscriptions

