yahoo-eng-team team mailing list archive
Message #87021
[Bug 1836754] Re: Conflict when deleting allocations for an instance that hasn't finished building
Reviewed: https://review.opendev.org/c/openstack/nova/+/688802
Committed: https://opendev.org/openstack/nova/commit/c09d98dadb6cd69e294420ba7ecea0f9b9cfcd71
Submitter: "Zuul (22348)"
Branch: master
commit c09d98dadb6cd69e294420ba7ecea0f9b9cfcd71
Author: Matt Riedemann <mriedem.os@xxxxxxxxx>
Date: Tue Oct 15 15:49:55 2019 -0400
Add force kwarg to delete_allocation_for_instance
This adds a force kwarg to delete_allocation_for_instance which
defaults to True because that was found to be the most common use case
by a significant margin during implementation of this patch.
In most cases, this method is called when we want to delete the
allocations because they should be gone, e.g. server delete, failed
build, or shelve offload. The alternative in these cases is the caller
could trap the conflict error and retry but we might as well just force
the delete in that case (it's cleaner).
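For comparison, the alternative mentioned above, where the caller traps the conflict and retries, might look roughly like the sketch below. It is illustrative only and is not code from nova: AllocationDeleteConflict, delete_with_retry and the delete_fn callable are hypothetical names standing in for whatever the caller would actually use.

```python
class AllocationDeleteConflict(Exception):
    """Hypothetical error standing in for a 409 conflict from Placement."""


def delete_with_retry(delete_fn, consumer_uuid, attempts=3):
    """Call delete_fn(consumer_uuid), retrying when it raises a conflict.

    delete_fn is an assumed callable that removes the consumer's
    allocations and raises AllocationDeleteConflict on a 409.
    """
    for attempt in range(1, attempts + 1):
        try:
            return delete_fn(consumer_uuid)
        except AllocationDeleteConflict:
            # Give up and re-raise once the retry budget is exhausted.
            if attempt == attempts:
                raise
```

Every caller that wants the allocations gone would need a wrapper like this, which is the extra ceremony that forcing the delete avoids.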
When force=True, it will DELETE the consumer allocations rather than
GET them and then PUT an empty allocations dict with the consumer
generation, which can result in a 409 conflict from Placement. For example, bug
1836754 shows that in one tempest test that creates a server and then
immediately deletes it, we can hit a very tight window where the method
GETs the allocations and before it PUTs the empty allocations to remove
them, something changes which results in a conflict and the server
delete fails with a 409 error.
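The race described above can be illustrated with a small in-memory model. This is a simplified sketch, not nova's or Placement's actual code: FakePlacement, Conflict and the before_put hook are hypothetical stand-ins, and the generation handling is only an approximation of how consumer generations behave.

```python
class Conflict(Exception):
    """Stands in for an HTTP 409 response from Placement."""


class FakePlacement:
    """Hypothetical in-memory stand-in for consumer allocations."""

    def __init__(self):
        self._allocations = {}  # consumer uuid -> allocations dict
        self._generations = {}  # consumer uuid -> consumer generation

    def get(self, consumer):
        return {
            "allocations": dict(self._allocations.get(consumer, {})),
            "consumer_generation": self._generations.get(consumer),
        }

    def put(self, consumer, allocations, consumer_generation):
        current = self._generations.get(consumer)
        # A stale generation means someone else updated the consumer.
        if consumer_generation != current:
            raise Conflict("placement.concurrent_update")
        if allocations:
            self._allocations[consumer] = dict(allocations)
            self._generations[consumer] = 0 if current is None else current + 1
        else:
            # Empty allocations remove the consumer in this model.
            self.delete(consumer)

    def delete(self, consumer):
        # Unconditional removal: no generation is checked.
        self._allocations.pop(consumer, None)
        self._generations.pop(consumer, None)


def delete_allocation_for_instance(placement, consumer, force=True,
                                   before_put=None):
    """Sketch of the two code paths; before_put simulates a racing writer."""
    if force:
        placement.delete(consumer)
        return True
    current = placement.get(consumer)
    if not current["allocations"]:
        return False
    if before_put is not None:
        before_put()  # something changes between the GET and the PUT
    placement.put(consumer, {}, current["consumer_generation"])
    return True
```

In this model, a write that lands between the GET and the PUT bumps the consumer generation, so the PUT with the stale generation raises Conflict, while the force=True path never reads the generation and cannot hit that window.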
It's worth noting that delete_allocation_for_instance used to just
DELETE the allocations before Stein [1] when we started taking consumer
generations into account. There was also a related mailing list thread
[2].
Closes-Bug: #1836754
[1] I77f34788dd7ab8fdf60d668a4f76452e03cf9888
[2] http://lists.openstack.org/pipermail/openstack-dev/2018-August/133374.html
Change-Id: Ife3c7a5a95c5d707983ab33fd2fbfc1cfb72f676
** Changed in: nova
Status: In Progress => Fix Released
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1836754
Title:
Conflict when deleting allocations for an instance that hasn't
finished building
Status in OpenStack Compute (nova):
Fix Released
Bug description:
Description
===========
When deleting an instance that hasn't finished building, we'll
sometimes get a 409 from placement as such:
Failed to delete allocations for consumer
6494d4d3-013e-478f-9ac1-37ca7a67b776. Error: {"errors": [{"status":
409, "title": "Conflict", "detail": "There was a conflict when trying
to complete your request.\n\n Inventory and/or allocations
changed while attempting to allocate: Another thread concurrently
updated the data. Please retry your update ", "code":
"placement.concurrent_update", "request_id":
"req-6dcd766b-f5d3-49fa-89f3-02e64079046a"}]}
Steps to reproduce
==================
1. Boot an instance
2. Don't wait for it to become active
3. Delete it immediately
Expected result
===============
The instance deletes successfully.
Actual result
=============
Nova bubbles up that error from Placement.
Logs & Configs
==============
This is being hit at a low rate in various CI tests; the logstash
query is here:
http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22Inventory%20and%2For%20allocations%20changed%20while%20attempting%20to%20allocate%3A%20Another%20thread%20concurrently%20updated%20the%20data%5C%22%20AND%20filename%3A%5C%22job-output.txt%5C%22
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1836754/+subscriptions