yahoo-eng-team team mailing list archive
Message #67044
[Bug 1713796] [NEW] Failed unshelve does not remove allocations from destination node
Public bug reported:
During an unshelve from an offloaded instance, conductor will call the
scheduler to pick a host. The scheduler will make allocations against
the chosen node as part of that select_destinations() call. Then
conductor casts to that compute host to unshelve the instance.
If the spawn on the hypervisor fails after we've made the instance
claim:
https://github.com/openstack/nova/blob/16.0.0.0rc2/nova/compute/manager.py#L4485
or if the claim test itself fails, the allocations on the destination
node aren't removed in Placement.
The RT aborts the claim here:
https://github.com/openstack/nova/blob/16.0.0.0rc2/nova/compute/resource_tracker.py#L414
That calls _update_usage_from_instance but doesn't change the
has_ocata_computes kwarg so we get here:
https://github.com/openstack/nova/blob/16.0.0.0rc2/nova/compute/resource_tracker.py#L1041
And we don't clean up the allocations for the instance.
The other case is when the claim fails: the instance_claim method
raises ComputeResourcesUnavailable, which is handled here:
https://github.com/openstack/nova/blob/16.0.0.0rc2/nova/compute/claims.py#L161
https://github.com/openstack/nova/blob/16.0.0.0rc2/nova/compute/manager.py#L4491
But we don't remove allocations or do any other cleanup there.
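To make the two failure paths concrete, here is a minimal, self-contained
sketch of the flow described above. It is not nova code: FakePlacement,
unshelve(), failing_claim(), failing_spawn() and the cleanup_on_failure
flag are all hypothetical names, and the exception class is a local
stand-in for nova's ComputeResourcesUnavailable. It only illustrates how
an allocation made at scheduling time is leaked unless the failure
handler removes it.

```python
class ComputeResourcesUnavailable(Exception):
    """Local stand-in for the error raised when a claim test fails."""


class FakePlacement:
    """Toy in-memory stand-in for the Placement service (hypothetical)."""

    def __init__(self):
        # Maps (instance_uuid, node) -> allocated resources.
        self.allocations = {}

    def allocate(self, instance_uuid, node, resources):
        self.allocations[(instance_uuid, node)] = dict(resources)

    def delete_allocation(self, instance_uuid, node):
        self.allocations.pop((instance_uuid, node), None)


def failing_claim():
    """Simulates the claim test failing on the destination node."""
    raise ComputeResourcesUnavailable("claim test failed")


def failing_spawn():
    """Simulates the hypervisor failing to spawn the instance."""
    raise RuntimeError("spawn failed")


def unshelve(placement, instance_uuid, node, resources,
             claim, spawn, cleanup_on_failure):
    """Model of an unshelve: allocate first, then claim and spawn.

    With cleanup_on_failure=False this mimics the reported behaviour
    (the allocation survives the failure). With True it shows the
    cleanup the report says is missing.
    """
    # Scheduling step: the allocation exists before compute does anything.
    placement.allocate(instance_uuid, node, resources)
    try:
        claim()
        spawn()
    except Exception:
        if cleanup_on_failure:
            placement.delete_allocation(instance_uuid, node)
        raise


def allocation_leaked(claim, spawn, cleanup_on_failure):
    """Run one failed unshelve and report whether the allocation remains."""
    placement = FakePlacement()
    try:
        unshelve(placement, "inst-1", "node-1", {"VCPU": 1},
                 claim, spawn, cleanup_on_failure)
    except (ComputeResourcesUnavailable, RuntimeError):
        pass
    return ("inst-1", "node-1") in placement.allocations
```

In this model, allocation_leaked(failing_claim, failing_spawn, False)
and allocation_leaked(lambda: None, failing_spawn, False) both report a
leaked allocation, which corresponds to the two cases in this report.
Passing cleanup_on_failure=True removes it in both cases.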
** Affects: nova
Importance: High
Status: Triaged
** Affects: nova/pike
Importance: High
Status: Confirmed
** Tags: placement shelve unshelve
** Also affects: nova/pike
Importance: Undecided
Status: New
** Changed in: nova/pike
Status: New => Confirmed
** Changed in: nova/pike
Importance: Undecided => High
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1713796
Title:
Failed unshelve does not remove allocations from destination node
Status in OpenStack Compute (nova):
Triaged
Status in OpenStack Compute (nova) pike series:
Confirmed