
yahoo-eng-team team mailing list archive

[Bug 1735430] Re: Report client doesn't handle RP create conflict (409) properly

 

Reviewed:  https://review.openstack.org/524263
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=112cd9cd1f31e091920e2f55fb213f78152dfd37
Submitter: Zuul
Branch:    master

commit 112cd9cd1f31e091920e2f55fb213f78152dfd37
Author: Eric Fried <efried@xxxxxxxxxx>
Date:   Thu Nov 30 11:39:11 2017 -0600

    Proper error handling by _ensure_resource_provider
    
    Previously, if _ensure_resource_provider encountered any error from the
    placement REST API, it would (sometimes log a message and) return None.
    
    Furthermore, a name conflict while creating the provider was treated the
    same as a UUID conflict, which would actually result in None being
    returned.
    
    With this change set, the error paths that previously returned None now
    raise one of the new ResourceProviderRetrievalFailed or
    ResourceProviderCreationFailed exceptions; and the name conflict path is
    detected and treated as an error condition.
    
    Note: This change set only touches the SchedulerReportClient side of
    these error conditions - it makes no attempt to add error handling to
    its callers.  Case in point, the API samples tests needed fixing because
    they were previously running into the name conflict error condition, but
    not noticing.  As currently implemented, the new exceptions will
    percolate up to ComputeManager.update_available_resource_for_node like
    any others coming from SchedulerReportClient, where they will be logged
    and ignored.
    
    Change-Id: I0c4ca6a81f213277fe7219cb905a805712f81e36
    Closes-Bug: #1735430


** Changed in: nova
       Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1735430

Title:
  Report client doesn't handle RP create conflict (409) properly

Status in OpenStack Compute (nova):
  Fix Released
Status in OpenStack Compute (nova) ocata series:
  Confirmed
Status in OpenStack Compute (nova) pike series:
  In Progress

Bug description:
  POST /resource_providers can fail with conflict (HTTP status 409) for
  (at least) two reasons: A provider with the specified UUID exists;
  *or* a provider with the specified *name* already exists.

  In SchedulerReportClient, _ensure_resource_provider uses helper method
  _create_resource_provider, whose logic goes like this:

   POST /resource_providers { 'uuid': <uuid>, 'name': <name> }
   if 201:
       cool, return the result
   if 409:
       LOG("Another thread created a provider with this *UUID*")
       GET /resource_providers/<uuid>
       if 200:
           cool, return the result
       if 404 or any other error:
           return None
   if any other error:
       return None

  PROBLEM: If a provider exists with the desired *name* (but a different
  UUID), this code will always return None (via that 404 path).
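
  To make the failure concrete, here is a minimal sketch of the logic
  above run against an in-memory stand-in for placement.  FakePlacement
  and the function signature are hypothetical and exist only for
  illustration; this is not the actual SchedulerReportClient code.

```python
import uuid


class FakePlacement:
    """In-memory stand-in for the placement API (hypothetical)."""

    def __init__(self):
        self.providers = {}  # uuid -> name

    def post(self, rp_uuid, name):
        # 409 if either the UUID or the name is already taken.
        if rp_uuid in self.providers or name in self.providers.values():
            return 409
        self.providers[rp_uuid] = name
        return 201

    def get(self, rp_uuid):
        if rp_uuid in self.providers:
            return 200, {"uuid": rp_uuid, "name": self.providers[rp_uuid]}
        return 404, None


def create_resource_provider(placement, rp_uuid, name):
    """Mirror the logic described above: every failure collapses to None."""
    status = placement.post(rp_uuid, name)
    if status == 201:
        return {"uuid": rp_uuid, "name": name}
    if status == 409:
        # Assumes the conflict was on the UUID, which is not always true.
        get_status, body = placement.get(rp_uuid)
        if get_status == 200:
            return body
        return None
    return None


placement = FakePlacement()
first = str(uuid.uuid4())
second = str(uuid.uuid4())

created = create_resource_provider(placement, first, "compute1")
same_uuid = create_resource_provider(placement, first, "compute1")
# Same name, different UUID: POST gives 409, GET gives 404, result is None.
name_clash = create_resource_provider(placement, second, "compute1")
```

  In this sketch the UUID conflict recovers the existing provider, while
  the name conflict silently yields None.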

  PROBLEM: Nobody up the stack is checking the return for None.

  What this effectively means is that _ensure_resource_provider...
  doesn't.

  IMO we should raise an exception in these error paths, forcing
  consuming code to handle them explicitly.  But at the very least, any
  code consuming _ensure_resource_provider needs to validate that it
  succeeds.
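
  A rough sketch of what raising in these error paths could look like.
  The exception names are taken from the commit message above; the
  function signature, the post/get callables and the rule "409 on POST
  followed by 404 on GET means a name conflict" are assumptions made for
  illustration, not the code that was merged.

```python
class ResourceProviderRetrievalFailed(Exception):
    """An existing provider could not be fetched."""


class ResourceProviderCreationFailed(Exception):
    """A provider could not be created."""


def create_resource_provider(post, get, rp_uuid, name):
    """Create a provider, raising instead of returning None on failure.

    post(rp_uuid, name) returns an HTTP status code; get(rp_uuid)
    returns a (status, body) tuple.  Both are hypothetical callables.
    """
    status = post(rp_uuid, name)
    if status == 201:
        return {"uuid": rp_uuid, "name": name}
    if status == 409:
        get_status, body = get(rp_uuid)
        if get_status == 200:
            # UUID conflict: someone else created it first; reuse it.
            return body
        if get_status == 404:
            # Assumption: nothing has this UUID, so the name clashed.
            raise ResourceProviderCreationFailed(
                "name %r is already in use" % name)
        raise ResourceProviderRetrievalFailed(
            "GET for %s returned %s" % (rp_uuid, get_status))
    raise ResourceProviderCreationFailed(
        "POST for %s returned %s" % (rp_uuid, status))


# Tiny in-memory fake to exercise the paths (illustrative only).
providers = {"uuid-a": "compute1"}


def post(rp_uuid, name):
    if rp_uuid in providers or name in providers.values():
        return 409
    providers[rp_uuid] = name
    return 201


def get(rp_uuid):
    if rp_uuid in providers:
        return 200, {"uuid": rp_uuid, "name": providers[rp_uuid]}
    return 404, None


try:
    create_resource_provider(post, get, "uuid-b", "compute1")
    outcome = "returned"
except ResourceProviderCreationFailed as exc:
    outcome = "raised: %s" % exc
```

  With this shape a caller can no longer mistake a failed ensure for a
  successful one; it either gets a provider back or has to deal with an
  exception.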

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1735430/+subscriptions

