
yahoo-eng-team team mailing list archive

[Bug 1719933] Re: placement server needs to retry allocations, server-side

 

Reviewed:  https://review.openstack.org/586048
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=72e4c4c8d7fb146862b899337626485dad10f15b
Submitter: Zuul
Branch:    master

commit 72e4c4c8d7fb146862b899337626485dad10f15b
Author: Chris Dent <cdent@xxxxxxxxxxxxx>
Date:   Thu Jul 26 12:00:14 2018 +0100

    [placement] Retry allocation writes server side
    
    This change adds a fast retry loop around
    AllocationList._set_allocations that runs when a resource provider
    generation conflict happens. It turns out that when many allocation
    claims are made concurrently against the same resource provider,
    conflicts can be quite common, and client-side retries are insufficient.
    
    Because consumer generation conflicts and resource provider generation
    conflicts raised the same exception, there was no way to distinguish
    between the two, so a child of ConcurrentUpdateDetected has been
    created: ResourceProviderConcurrentUpdateDetected. In the future this
    will allow us to send different error codes to the client as well, but
    that change is not done here.
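
A minimal sketch of the exception split described above. Only the two class names come from the commit message; the base class body, the `classify` helper, and its return labels are hypothetical stand-ins for illustration, not nova's actual code.

```python
class ConcurrentUpdateDetected(Exception):
    """Stand-in for the existing generation-conflict exception."""


class ResourceProviderConcurrentUpdateDetected(ConcurrentUpdateDetected):
    """Child raised only for resource provider generation conflicts."""


def classify(exc):
    # Hypothetical helper: the child class must be checked first, because
    # an instance of the child is also an instance of the parent.
    if isinstance(exc, ResourceProviderConcurrentUpdateDetected):
        return "resource-provider-conflict"
    if isinstance(exc, ConcurrentUpdateDetected):
        return "other-concurrent-update"
    return "unrelated"
```

Because the new exception is a child of the old one, any existing handler that catches ConcurrentUpdateDetected keeps working unchanged, while new code can catch the narrower class.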
    
    When the conflict is detected, all the resource providers in the
    AllocationList are reloaded and the list objects refreshed.
    
    Logging is provided to indicate:
    
    * at debug that a retry is going to happen
    * at warning that all the retries failed and the client is going to
      see the conflict
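
The retry, reload, and logging behaviour described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the committed code: `write_with_retries`, the `write` and `refresh` callables, and the attempt count of 3 are all hypothetical, and the exception classes are local stand-ins for the ones named in the commit message.

```python
import logging

LOG = logging.getLogger(__name__)

MAX_ATTEMPTS = 3  # assumed value; the commit message does not state one


class ConcurrentUpdateDetected(Exception):
    """Stand-in for the existing generation-conflict exception."""


class ResourceProviderConcurrentUpdateDetected(ConcurrentUpdateDetected):
    """Stand-in for the resource provider specific child exception."""


def write_with_retries(write, refresh, max_attempts=MAX_ATTEMPTS):
    """Call write(); on a resource provider conflict, refresh() and retry.

    write:   hypothetical callable that performs the allocation write.
    refresh: hypothetical callable that reloads the resource providers.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return write()
        except ResourceProviderConcurrentUpdateDetected:
            if attempt == max_attempts:
                # All attempts used up: the client will see the conflict.
                LOG.warning("Allocation write failed after %d attempts",
                            max_attempts)
                raise
            LOG.debug("Resource provider generation conflict, retrying "
                      "(attempt %d of %d)", attempt, max_attempts)
            refresh()
```

Only the resource provider variant is caught here, so in this sketch any other ConcurrentUpdateDetected propagates to the caller on the first occurrence without a retry.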
    
    The tests for this are a bit funky: some mocks are used to cause the
    conflicts, then the real actions run after a couple of iterations.
    
    Change-Id: Id614d609fc8f3ed2d2ff29a2b52143f53b3b1b9a
    Closes-Bug: #1719933


** Changed in: nova
       Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1719933

Title:
  placement server needs to retry allocations, server-side

Status in OpenStack Compute (nova):
  Fix Released

Bug description:
  A long time ago a TODO was added in placement:
  https://github.com/openstack/nova/blob/faede889d3620f8ff0131a7a4c6b9c1bc844cd06/nova/objects/resource_provider.py#L1837-L1839

  We need to implement that TODO; this is a note to self.

  This is related to what may be a different bug: when heavily loaded
  with many single requests, the placement server is unexpectedly
  receiving 409s about generation problems. Discussion of that led to
  remembering that this TODO needs to be fixed. Fixing the TODO needs to
  be done regardless of that problem.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1719933/+subscriptions

