yahoo-eng-team team mailing list archive
Message #89170
[Bug 1852610] Re: [SRU] API allows source compute service/node deletion while instances are pending a resize confirm/revert
** Summary changed:
- API allows source compute service/node deletion while instances are pending a resize confirm/revert
+ [SRU] API allows source compute service/node deletion while instances are pending a resize confirm/revert
** Also affects: nova (Ubuntu)
Importance: Undecided
Status: New
** Also affects: nova (Ubuntu Bionic)
Importance: Undecided
Status: New
** Changed in: nova (Ubuntu Bionic)
Status: New => In Progress
** Changed in: nova (Ubuntu Bionic)
Assignee: (unassigned) => Brett Milford (brettmilford)
** Also affects: cloud-archive
Importance: Undecided
Status: New
** Also affects: cloud-archive/queens
Importance: Undecided
Status: New
** Changed in: cloud-archive/queens
Status: New => In Progress
** Changed in: cloud-archive/queens
Assignee: (unassigned) => Brett Milford (brettmilford)
** Description changed:
- This is split off from bug 1829479 which is about deleting a compute
- service which had servers evacuated from it which will orphan resource
- providers in placement.
+ [Impact]
+
+ * API will allow deleting a source compute service which has
+ migration-based allocations for the source node resource provider and
+ pending instance resizes involving the source node.
+
+ * Backporting the fix will improve application resilience in this case.
+
+ [Test Case]
+
+ 1. create a server on host1
+ 2. resize or cold migrate it to a dest host2
+ 3. delete the compute service for host1
+
+ At this point the resource provider for host1 is orphaned.
+
+ 4. try to confirm/revert the resize of the server which will fail
+ because the compute node for host1 is gone and this results in the
+ server going to ERROR status
+
+ [Where problems could occur]
+
+ * This change introduces an exception condition in the API and prevents
+ the erroneous deletion of compute services which would result in
+ orphaned state.
+
+ * As such we should expect to see altered behavior from the API as
+ detailed in api-ref/source/os-services.inc
+
+ * If problems were to occur they would manifest in behavior that is
+ different from both the original behavior of the API and the new
+ behavior.
+
+ --- Original Description ---
+ This is split off from bug 1829479 which is about deleting a compute service which had servers evacuated from it which will orphan resource providers in placement.
A similar scenario is true where the API will allow deleting a source
compute service which has migration-based allocations for the source
node resource provider and pending instance resizes involving the source
node. A simple scenario is:
1. create a server on host1
2. resize or cold migrate it to a dest host2
3. delete the compute service for host1
At this point the resource provider for host1 is orphaned.
4. try to confirm/revert the resize of the server which will fail
because the compute node for host1 is gone and this results in the
server going to ERROR status
Based on the discussion in this mailing list thread:
http://lists.openstack.org/pipermail/openstack-discuss/2019-November/010843.html
We should probably have the DELETE /os-services/{service_id} API block
trying to delete a service that has pending migrations.
** Patch added: "lp1852610-bionic.debdiff"
https://bugs.launchpad.net/cloud-archive/+bug/1852610/+attachment/5599045/+files/lp1852610-bionic.debdiff
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1852610
Title:
[SRU] API allows source compute service/node deletion while instances
are pending a resize confirm/revert
Status in Ubuntu Cloud Archive:
New
Status in Ubuntu Cloud Archive queens series:
In Progress
Status in OpenStack Compute (nova):
Fix Released
Status in OpenStack Compute (nova) queens series:
Fix Committed
Status in OpenStack Compute (nova) rocky series:
Fix Committed
Status in OpenStack Compute (nova) stein series:
Fix Committed
Status in OpenStack Compute (nova) train series:
Fix Committed
Status in nova package in Ubuntu:
New
Status in nova source package in Bionic:
In Progress
Bug description:
[Impact]
* API will allow deleting a source compute service which has
migration-based allocations for the source node resource provider and
pending instance resizes involving the source node.
* Backporting the fix will improve application resilience in this
case.
[Test Case]
1. create a server on host1
2. resize or cold migrate it to a dest host2
3. delete the compute service for host1
At this point the resource provider for host1 is orphaned.
4. try to confirm/revert the resize of the server which will fail
because the compute node for host1 is gone and this results in the
server going to ERROR status
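The four steps above can be modelled in a few lines of Python. This is a toy in-memory sketch, not nova code: `reproduce`, `confirm_resize`, `ComputeNodeGone` and the dict layout are hypothetical names chosen for illustration. Only the host names, the order of operations and the ERROR outcome come from the report.

```python
class ComputeNodeGone(Exception):
    """Hypothetical error: the source compute node no longer exists."""


def confirm_resize(server, compute_nodes):
    """Confirm a pending resize; needs the source node to still exist."""
    source = server["migration"]["source_host"]
    if source not in compute_nodes:
        # Mirrors the reported outcome: the server ends up in ERROR.
        server["status"] = "ERROR"
        raise ComputeNodeGone(source)
    server["status"] = "ACTIVE"
    server["migration"] = None


def reproduce():
    compute_nodes = {"host1", "host2"}

    # 1. create a server on host1
    server = {"host": "host1", "status": "ACTIVE", "migration": None}

    # 2. resize or cold migrate it to host2; a confirm/revert is now pending
    server["migration"] = {"source_host": "host1", "dest_host": "host2"}
    server["host"] = "host2"
    server["status"] = "VERIFY_RESIZE"

    # 3. delete the compute service for host1 (nothing blocks this here)
    compute_nodes.discard("host1")

    # 4. confirming the resize fails because host1 is gone
    try:
        confirm_resize(server, compute_nodes)
    except ComputeNodeGone:
        pass
    return server["status"]


if __name__ == "__main__":
    print(reproduce())
```

In this model the failure comes purely from step 3 being permitted while the migration record still points at host1.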
[Where problems could occur]
* This change introduces an exception condition in the API and
prevents the erroneous deletion of compute services which would result
in orphaned state.
* As such we should expect to see altered behavior from the API as
detailed in api-ref/source/os-services.inc
* If problems were to occur they would manifest in behavior that is
different from both the original behavior of the API and the new
behavior.
--- Original Description ---
This is split off from bug 1829479 which is about deleting a compute service which had servers evacuated from it which will orphan resource providers in placement.
A similar scenario is true where the API will allow deleting a source
compute service which has migration-based allocations for the source
node resource provider and pending instance resizes involving the
source node. A simple scenario is:
1. create a server on host1
2. resize or cold migrate it to a dest host2
3. delete the compute service for host1
At this point the resource provider for host1 is orphaned.
4. try to confirm/revert the resize of the server which will fail
because the compute node for host1 is gone and this results in the
server going to ERROR status
Based on the discussion in this mailing list thread:
http://lists.openstack.org/pipermail/openstack-discuss/2019-November/010843.html
We should probably have the DELETE /os-services/{service_id} API block
trying to delete a service that has pending migrations.
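A minimal sketch of the kind of guard proposed here, written as standalone Python rather than as the actual nova change. The `Migration` class, the `ServiceDeleteConflict` exception, the `delete_compute_service` function and the `IN_PROGRESS` status set are all assumptions made for illustration. In particular, treating "finished" as "resized but not yet confirmed or reverted" and checking both source and destination hosts are modelling choices, not details taken from this report.

```python
from dataclasses import dataclass

# Assumed set of statuses that count as "still pending" in this sketch.
IN_PROGRESS = frozenset({"migrating", "post-migrating", "finished"})


@dataclass
class Migration:
    source_compute: str
    dest_compute: str
    status: str


class ServiceDeleteConflict(Exception):
    """Hypothetical error raised when deletion must be refused."""


def delete_compute_service(host, services, migrations):
    """Remove `host` from `services` unless it has pending migrations."""
    pending = [
        m for m in migrations
        if m.status in IN_PROGRESS
        and host in (m.source_compute, m.dest_compute)
    ]
    if pending:
        raise ServiceDeleteConflict(
            f"{host} has {len(pending)} pending migration(s); "
            "confirm or revert them before deleting the service"
        )
    services.remove(host)
```

With a guard of this shape, the delete in step 3 is rejected while the resize is unconfirmed, and succeeds once the migration has been confirmed or reverted.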
To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1852610/+subscriptions