yahoo-eng-team team mailing list archive

[Bug 1836754] Re: Conflict when deleting allocations for an instance that hasn't finished building

 

Reviewed:  https://review.opendev.org/c/openstack/nova/+/688802
Committed: https://opendev.org/openstack/nova/commit/c09d98dadb6cd69e294420ba7ecea0f9b9cfcd71
Submitter: "Zuul (22348)"
Branch:    master

commit c09d98dadb6cd69e294420ba7ecea0f9b9cfcd71
Author: Matt Riedemann <mriedem.os@xxxxxxxxx>
Date:   Tue Oct 15 15:49:55 2019 -0400

    Add force kwarg to delete_allocation_for_instance
    
    This adds a force kwarg to delete_allocation_for_instance which
    defaults to True because that was found to be the most common use case
    by a significant margin during implementation of this patch.
    In most cases, this method is called when we want to delete the
    allocations because they should be gone, e.g. server delete, failed
    build, or shelve offload. The alternative in these cases is the caller
    could trap the conflict error and retry but we might as well just force
    the delete in that case (it's cleaner).
    
    When force=True, it will DELETE the consumer allocations rather than
    GET and PUT with an empty allocations dict and the consumer generation
    which can result in a 409 conflict from Placement. For example, bug
    1836754 shows that in one tempest test that creates a server and then
    immediately deletes it, we can hit a very tight window where the method
    GETs the allocations and before it PUTs the empty allocations to remove
    them, something changes which results in a conflict and the server
    delete fails with a 409 error.
    
    It's worth noting that delete_allocation_for_instance used to just
    DELETE the allocations before Stein [1] when we started taking consumer
    generations into account. There was also a related mailing list thread
    [2].
    
    
    Closes-Bug: #1836754
    
    [1] I77f34788dd7ab8fdf60d668a4f76452e03cf9888
    [2] http://lists.openstack.org/pipermail/openstack-dev/2018-August/133374.html
    
    Change-Id: Ife3c7a5a95c5d707983ab33fd2fbfc1cfb72f676
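
The difference between the two code paths described in the commit message can be illustrated with a minimal sketch. This is not nova's implementation: `FakePlacement`, `Conflict`, `delete_allocation_for_instance_sketch`, and the `before_put` hook are hypothetical names invented here, and the generation handling is a simplified assumption about how Placement checks consumer generations.

```python
# Hypothetical in-memory stand-in for the Placement allocations API.
# Nothing here is nova or Placement code; it only illustrates the race.


class Conflict(Exception):
    """Stands in for an HTTP 409 placement.concurrent_update response."""


class FakePlacement:
    def __init__(self):
        # consumer uuid -> {"allocations": dict, "consumer_generation": int}
        self._consumers = {}

    def get_allocations(self, consumer):
        current = self._consumers.get(consumer)
        if current is None:
            return {"allocations": {}}
        return {
            "allocations": dict(current["allocations"]),
            "consumer_generation": current["consumer_generation"],
        }

    def put_allocations(self, consumer, allocations, consumer_generation):
        current = self._consumers.get(consumer)
        current_gen = current["consumer_generation"] if current else None
        # Assumed rule: the caller's generation must match the stored one.
        if consumer_generation != current_gen:
            raise Conflict("consumer generation mismatch")
        if not allocations:
            # An empty allocations dict removes the consumer's allocations.
            self._consumers.pop(consumer, None)
            return
        next_gen = 0 if current_gen is None else current_gen + 1
        self._consumers[consumer] = {
            "allocations": dict(allocations),
            "consumer_generation": next_gen,
        }

    def delete_allocations(self, consumer):
        # DELETE ignores the consumer generation entirely.
        self._consumers.pop(consumer, None)


def delete_allocation_for_instance_sketch(placement, consumer, force=True,
                                          before_put=None):
    if force:
        placement.delete_allocations(consumer)
        return True
    current = placement.get_allocations(consumer)
    if not current["allocations"]:
        return False
    if before_put is not None:
        before_put()  # hypothetical hook simulating a concurrent writer
    placement.put_allocations(consumer, {}, current["consumer_generation"])
    return True


def demo():
    placement = FakePlacement()
    consumer = "6494d4d3-013e-478f-9ac1-37ca7a67b776"
    placement.put_allocations(consumer, {"rp-1": {"VCPU": 1}}, None)

    def concurrent_update():
        # Another writer changes the allocations between our GET and PUT.
        gen = placement.get_allocations(consumer)["consumer_generation"]
        placement.put_allocations(
            consumer, {"rp-1": {"VCPU": 1, "MEMORY_MB": 512}}, gen)

    try:
        delete_allocation_for_instance_sketch(
            placement, consumer, force=False, before_put=concurrent_update)
        non_forced = "deleted"
    except Conflict:
        non_forced = "conflict"

    delete_allocation_for_instance_sketch(placement, consumer, force=True)
    remaining = placement.get_allocations(consumer)["allocations"]
    return non_forced, remaining
```

In this sketch the non-forced path fails because the generation it read is stale by the time it sends the PUT, while the forced path removes the allocations regardless of what changed in between.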


** Changed in: nova
       Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1836754

Title:
  Conflict when deleting allocations for an instance that hasn't
  finished building

Status in OpenStack Compute (nova):
  Fix Released

Bug description:
  Description
  ===========

  When deleting an instance that hasn't finished building, we'll
  sometimes get a 409 from placement as such:

  Failed to delete allocations for consumer
  6494d4d3-013e-478f-9ac1-37ca7a67b776. Error: {"errors": [{"status":
  409, "title": "Conflict", "detail": "There was a conflict when trying
  to complete your request.\\\\n\\\\n Inventory and/or allocations
  changed while attempting to allocate: Another thread concurrently
  updated the data. Please retry your update  ", "code":
  "placement.concurrent_update", "request_id":
  "req-6dcd766b-f5d3-49fa-89f3-02e64079046a"}]}

  Steps to reproduce
  ==================

  1. Boot an instance
  2. Don't wait for it to become active
  3. Delete it immediately

  Expected result
  ===============

  The instance deletes successfully.

  Actual result
  =============

  Nova bubbles up that error from Placement.

  Logs & Configs
  ==============

  This is being hit at a low rate in various CI tests. The logstash
  query is here:

  http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22Inventory%20and%2For%20allocations%20changed%20while%20attempting%20to%20allocate%3A%20Another%20thread%20concurrently%20updated%20the%20data%5C%22%20AND%20filename%3A%5C%22job-output.txt%5C%22

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1836754/+subscriptions


