yahoo-eng-team team mailing list archive

[Bug 1852610] Re: [SRU] API allows source compute service/node deletion while instances are pending a resize confirm/revert

 

This bug was fixed in the package nova - 2:17.0.13-0ubuntu5

---------------
nova (2:17.0.13-0ubuntu5) bionic; urgency=medium

  * Fixes API to disallow source compute service/node deletion while instances
    are pending a resize confirm/revert (LP: #1852610).
   - d/p/0001-lp1852610_Add_functional_recreate_test_for_bug_1829479_and_bug_1817833.patch
   - d/p/0002-lp1852610_Add_functional_recreate_test_for_bug_1852610.patch
   - d/p/0003-lp1852610_Add_functional_recreate_revert_resize_test_for_bug_1852610.patch
   - d/p/0004-lp1852610_api_allows_source_compute_service.patch

 -- Brett Milford <brett.milford@xxxxxxxxxxxxx>  Thu, 23 Jun 2022 16:41:00 +1000

** Changed in: nova (Ubuntu Bionic)
       Status: Fix Committed => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1852610

Title:
  [SRU] API allows source compute service/node deletion while instances
  are pending a resize confirm/revert

Status in Ubuntu Cloud Archive:
  Invalid
Status in Ubuntu Cloud Archive queens series:
  Won't Fix
Status in OpenStack Compute (nova):
  Fix Released
Status in OpenStack Compute (nova) queens series:
  Fix Committed
Status in OpenStack Compute (nova) rocky series:
  Fix Committed
Status in OpenStack Compute (nova) stein series:
  Fix Committed
Status in OpenStack Compute (nova) train series:
  Fix Committed
Status in nova package in Ubuntu:
  Invalid
Status in nova source package in Bionic:
  Fix Released

Bug description:
  [Impact]

   * The API allows deleting a source compute service that has
  migration-based allocations for the source node resource provider and
  pending instance resizes involving the source node.

   * Backporting the fix blocks that deletion, so the source node
  resource provider is not orphaned and the pending resize can still be
  confirmed or reverted.

  [Test Case]

   1. create a server on host1
   2. resize or cold migrate it to a dest host2
   3. delete the compute service for host1

   At this point the resource provider for host1 is orphaned.

   4. try to confirm/revert the resize of the server; this fails
  because the compute node for host1 is gone, and the server goes to
  ERROR status
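
  The steps above can be walked through with a small in-memory model.
  This is only an illustrative sketch, not nova code: the Cloud and
  Server classes and all of their methods are hypothetical names made
  up for this example, and delete_service deliberately mimics the
  unfixed behavior by skipping any check for pending resizes.

```python
# Toy in-memory model of the reproduction steps; NOT nova code.
# All class and method names here are hypothetical.
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class Server:
    name: str
    host: str
    status: str = "ACTIVE"
    # Set while a resize is waiting for confirm/revert.
    source_host: Optional[str] = None


@dataclass
class Cloud:
    services: Set[str] = field(default_factory=set)
    servers: Dict[str, Server] = field(default_factory=dict)

    def create_server(self, name: str, host: str) -> None:
        self.servers[name] = Server(name, host)

    def resize(self, name: str, dest: str) -> None:
        server = self.servers[name]
        server.source_host, server.host = server.host, dest
        server.status = "VERIFY_RESIZE"

    def delete_service(self, host: str) -> None:
        # Unfixed behavior: no check for pending resizes on this host.
        self.services.discard(host)

    def confirm_resize(self, name: str) -> None:
        server = self.servers[name]
        if server.source_host not in self.services:
            # The source compute service is gone, so cleanup cannot run.
            server.status = "ERROR"
            return
        server.source_host = None
        server.status = "ACTIVE"


cloud = Cloud(services={"host1", "host2"})
cloud.create_server("vm1", "host1")   # step 1
cloud.resize("vm1", "host2")          # step 2
cloud.delete_service("host1")         # step 3
cloud.confirm_resize("vm1")           # step 4
print(cloud.servers["vm1"].status)
```

  In this model the server ends in ERROR because step 3 removed the
  service that step 4 still depends on.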

  [Where problems could occur]

   * This change introduces an exception condition in the API and
  prevents the erroneous deletion of compute services which would result
  in orphaned state.

   * As such we should expect to see altered behavior from the API as
  detailed in api-ref/source/os-services.inc

   * If problems were to occur they would manifest in behavior that is
  different from both the original behavior of the API and the new
  behavior.

  --- Original Description ---
  This is split off from bug 1829479, which is about deleting a compute
  service that had servers evacuated from it; doing so orphans resource
  providers in placement.

  A similar problem exists here: the API allows deleting a source
  compute service that has migration-based allocations for the source
  node resource provider and pending instance resizes involving the
  source node. A simple scenario is:

  1. create a server on host1
  2. resize or cold migrate it to a dest host2
  3. delete the compute service for host1

  At this point the resource provider for host1 is orphaned.

  4. try to confirm/revert the resize of the server; this fails
  because the compute node for host1 is gone, and the server goes to
  ERROR status

  Based on the discussion in this mailing list thread:

  http://lists.openstack.org/pipermail/openstack-discuss/2019-November/010843.html

  We should probably have the DELETE /os-services/{service_id} API block
  attempts to delete a service that has pending migrations.
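
  A minimal sketch of such a guard, under stated assumptions: the
  Migration record, the ServiceDeleteBlocked exception, and the
  delete_compute_service function are hypothetical names invented for
  this example, and the check looks only at the source host. It is not
  the patch that was applied to nova, and how the error maps onto an
  HTTP response is not specified here.

```python
# Hypothetical guard for a service-delete handler; NOT the nova patch.
from dataclasses import dataclass
from typing import Iterable, Set


class ServiceDeleteBlocked(Exception):
    """Hypothetical error an API layer could turn into a client error."""


@dataclass
class Migration:
    source_host: str
    dest_host: str
    pending: bool  # True while a resize still awaits confirm/revert


def delete_compute_service(host: str, services: Set[str],
                           migrations: Iterable[Migration]) -> None:
    """Remove host from services unless a pending migration uses it
    as its source host."""
    blocking = [m for m in migrations
                if m.pending and m.source_host == host]
    if blocking:
        raise ServiceDeleteBlocked(
            f"{host} has {len(blocking)} pending migration(s)")
    services.discard(host)


services = {"host1", "host2"}
migrations = [Migration("host1", "host2", pending=True)]
try:
    delete_compute_service("host1", services, migrations)
except ServiceDeleteBlocked as exc:
    print("refused:", exc)
```

  Once the resize is confirmed or reverted (pending becomes False in
  this model), the same call removes the service.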

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1852610/+subscriptions


