yahoo-eng-team team mailing list archive
Message #72525
[Bug 1679750] Re: Allocations are not cleaned up in placement for instance 'local delete' case
Reviewed: https://review.openstack.org/560706
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=ea9d0af31395fbe1686fa681cd91226ee580796e
Submitter: Zuul
Branch: master
commit ea9d0af31395fbe1686fa681cd91226ee580796e
Author: Matt Riedemann <mriedem.os@xxxxxxxxx>
Date: Wed Apr 11 21:24:43 2018 -0400
Delete allocations from API if nova-compute is down
When performing a "local delete" of an instance, we
need to delete the allocations that the instance has
against any resource providers in Placement.
It should be noted that without this change, restarting
the nova-compute service will delete the allocations
for its compute node (assuming the compute node UUID
is the same as before the instance was deleted). That
is shown in the existing functional test modified here.
The more important reason for this change is that in
order to fix bug 1756179, we need to make sure the
resource provider allocations for a given compute node
are gone by the time the compute service is deleted.
This adds a new functional test and a release note for
the new behavior and for the need to configure nova-api
to talk to placement. If placement is not configured, the
call is a no-op thanks to the @safe_connect decorator
used in SchedulerReportClient.
Closes-Bug: #1679750
Related-Bug: #1756179
Change-Id: If507e23f0b7e5fa417041c3870d77786498f741d
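The commit message above relies on @safe_connect to make the Placement call harmless when nova-api has no placement configuration. The following is only a minimal sketch of that idea, not nova's actual decorator: the EndpointNotFound exception, the ReportClientSketch class, and its endpoint attribute are hypothetical stand-ins introduced for illustration.

```python
import functools
import logging

LOG = logging.getLogger(__name__)


class EndpointNotFound(Exception):
    """Hypothetical stand-in for a 'placement endpoint not configured' error."""


def safe_connect(func):
    """Sketch: turn a configuration/connection failure into a logged no-op."""
    @functools.wraps(func)
    def wrapper(self, *args, **kwargs):
        try:
            return func(self, *args, **kwargs)
        except EndpointNotFound:
            # Assumption: the caller can tolerate None meaning "nothing was done".
            LOG.warning("Placement is not configured; skipping %s", func.__name__)
            return None
    return wrapper


class ReportClientSketch:
    """Hypothetical client; records deletions instead of calling Placement."""

    def __init__(self, endpoint=None):
        self.endpoint = endpoint
        self.deleted = []

    @safe_connect
    def delete_allocation_for_instance(self, instance_uuid):
        if self.endpoint is None:
            raise EndpointNotFound()
        self.deleted.append(instance_uuid)
        return True
```

With no endpoint set, delete_allocation_for_instance returns None and records nothing, so a local delete from an unconfigured nova-api would still complete; it just would not clean up the allocations.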
** Changed in: nova
Status: In Progress => Fix Released
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1679750
Title:
Allocations are not cleaned up in placement for instance 'local
delete' case
Status in OpenStack Compute (nova):
Fix Released
Status in OpenStack Compute (nova) pike series:
Confirmed
Status in OpenStack Compute (nova) queens series:
Confirmed
Bug description:
This is semi-related to bug 1661312 for evacuate.
This is the case:
1. Create an instance on host A successfully. There are allocation
records in the placement API for the instance (consumer for the
allocation records) and host A (resource provider).
2. Host A goes down.
3. Delete the instance. This triggers the local delete flow in the
compute API where we can't RPC cast to the compute to delete the
instance because the nova-compute service is down. So we do the delete
in the database from the compute API (local to compute API, hence
local delete).
The problem is that in step 3 we don't remove the allocations for the
instance from the host A resource provider during the local delete flow.
Maybe this doesn't matter while host A is down, since the scheduler
can't schedule to it anyway. But if host A comes back up, it will have
allocations tied to it for deleted instances.
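The scenario above can be modelled with a small in-memory sketch. This is not nova code: FakePlacement, local_delete, and the cleanup_allocations flag are hypothetical names used only to contrast the behavior described in this bug with the behavior after the fix.

```python
class FakePlacement:
    """In-memory stand-in for Placement: provider -> {consumer: resources}."""

    def __init__(self):
        self.allocations = {}

    def allocate(self, provider, consumer, resources):
        self.allocations.setdefault(provider, {})[consumer] = resources

    def delete_allocations(self, consumer):
        # Remove the consumer's allocations from every resource provider.
        for consumers in self.allocations.values():
            consumers.pop(consumer, None)

    def consumers_on(self, provider):
        return sorted(self.allocations.get(provider, {}))


def local_delete(instance, placement, cleanup_allocations):
    """Sketch of a delete done from the API because nova-compute is down."""
    # The instance is marked deleted in the (simulated) database.
    instance["deleted"] = True
    if cleanup_allocations:
        # Behavior with the fix: also drop the instance's allocations.
        placement.delete_allocations(instance["uuid"])


# Step 1: the instance is created on host A and has allocations there.
placement = FakePlacement()
placement.allocate("host-a", "inst-1", {"VCPU": 1, "MEMORY_MB": 512})
instance = {"uuid": "inst-1", "deleted": False}

# Steps 2 and 3 without the fix: the allocation outlives the instance.
local_delete(instance, placement, cleanup_allocations=False)
leaked = placement.consumers_on("host-a")

# Same delete with the fix: host A no longer carries the allocation.
local_delete(instance, placement, cleanup_allocations=True)
remaining = placement.consumers_on("host-a")
```

In this model, leaked still lists the deleted instance as a consumer on host A, while remaining is empty.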
On init_host in the compute service we call _complete_partial_deletion,
but that only handles instances that have a vm_state of 'deleted' but
aren't actually deleted in the database. I don't think that's going to
cover this case, because the local delete code in the compute API calls
instance.destroy(), which deletes the instance from the database (it
sets instances.deleted to a non-zero value in the DB, so the row is
"soft" deleted).
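To illustrate why the init_host cleanup would miss a locally deleted instance, here is a minimal sketch of the selection rule as the description states it. The row dictionaries, the partially_deleted helper, and the specific values are assumptions made for the example, not nova's data model or code.

```python
def partially_deleted(rows):
    """Select rows whose vm_state is 'deleted' but which still exist in the DB.

    Assumption: deleted == 0 means the row has not been soft deleted.
    """
    return [r for r in rows if r["vm_state"] == "deleted" and r["deleted"] == 0]


rows = [
    # Deletion started on the compute host but the DB row was never removed.
    {"uuid": "inst-partial", "vm_state": "deleted", "deleted": 0},
    # Local delete from the API: the row is soft deleted (non-zero 'deleted').
    {"uuid": "inst-local", "vm_state": "deleted", "deleted": 7},
    # A healthy instance that must not be touched.
    {"uuid": "inst-active", "vm_state": "active", "deleted": 0},
]

to_finish = [r["uuid"] for r in partially_deleted(rows)]
```

Only the partially deleted instance is selected; the locally deleted one is skipped because its row is already soft deleted, so its allocations would not be cleaned up by this path.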
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1679750/+subscriptions