yahoo-eng-team team mailing list archive

[Bug 1852610] Re: API allows source compute service/node deletion while instances are pending a resize confirm/revert

 

Reviewed:  https://review.opendev.org/694389
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=92fed026103b47fa2a76ea09204a4ba24c21e191
Submitter: Zuul
Branch:    master

commit 92fed026103b47fa2a76ea09204a4ba24c21e191
Author: Matt Riedemann <mriedem.os@xxxxxxxxx>
Date:   Thu Nov 14 14:19:26 2019 -0500

    Block deleting compute services with in-progress migrations
    
    This builds on I0bd63b655ad3d3d39af8d15c781ce0a45efc8e3a
    which made DELETE /os-services/{service_id} fail with a 409
    response if the host has instances on it. This change checks
    for in-progress migrations involving the nodes on the host,
    either as the source or destination nodes, and returns a 409
    error response if any are found.
    
    Failing to do this can lead to orphaned resource providers
    in placement and also failing to properly confirm or revert
    a pending resize or cold migration.
    
    A release note is included for the (justified) behavior
    change in the API. A new microversion should not be required
    for this since admins should not have to opt out of broken
    behavior.
    
    Change-Id: I70e06c607045a1c0842f13069e51fef438012a9c
    Closes-Bug: #1852610


** Changed in: nova
       Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1852610

Title:
  API allows source compute service/node deletion while instances are
  pending a resize confirm/revert

Status in OpenStack Compute (nova):
  Fix Released
Status in OpenStack Compute (nova) rocky series:
  New
Status in OpenStack Compute (nova) stein series:
  New
Status in OpenStack Compute (nova) train series:
  In Progress

Bug description:
  This is split off from bug 1829479, which is about deleting a compute
  service that had servers evacuated from it, which orphans resource
  providers in placement.

  A similar scenario is true where the API will allow deleting a source
  compute service which has migration-based allocations for the source
  node resource provider and pending instance resizes involving the
  source node. A simple scenario is:

  1. create a server on host1
  2. resize or cold migrate it to a dest host2
  3. delete the compute service for host1

  At this point the resource provider for host1 is orphaned.

  4. try to confirm/revert the resize of the server, which fails
  because the compute node for host1 is gone, leaving the server in
  ERROR status

  Based on the discussion in this mailing list thread:

  http://lists.openstack.org/pipermail/openstack-discuss/2019-November/010843.html

  We should probably have the DELETE /os-services/{service_id} API block
  trying to delete a service that has pending migrations.
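
  The proposed guard can be sketched roughly as follows. This is a
  self-contained illustration, not nova's actual code: the Migration
  record, the ServiceDeleteConflict exception, the
  check_can_delete_service helper and the set of statuses treated as
  "in progress" are all hypothetical names and values chosen for the
  example. The idea is only that a delete is refused with a 409 when any
  unfinished migration names one of the host's nodes as either source or
  destination.

```python
from dataclasses import dataclass

# Assumption: an illustrative set of "in progress" statuses. A resize that is
# waiting for confirm/revert is modelled here as "finished". The real list
# used by nova may differ.
IN_PROGRESS_STATUSES = frozenset(
    {"queued", "preparing", "running", "post-migrating", "finished"}
)


@dataclass(frozen=True)
class Migration:
    """Hypothetical, minimal stand-in for a migration record."""
    source_node: str
    dest_node: str
    status: str


class ServiceDeleteConflict(Exception):
    """Stands in for an HTTP 409 Conflict response from the API."""
    status_code = 409


def check_can_delete_service(host_nodes, migrations):
    """Raise ServiceDeleteConflict if any in-progress migration involves
    one of the host's nodes, as either the source or the destination."""
    nodes = set(host_nodes)
    blocking = [
        m for m in migrations
        if m.status in IN_PROGRESS_STATUSES
        and (m.source_node in nodes or m.dest_node in nodes)
    ]
    if blocking:
        raise ServiceDeleteConflict(
            f"{len(blocking)} in-progress migration(s) involve this host; "
            "complete, confirm or revert them before deleting the service"
        )


if __name__ == "__main__":
    # The scenario above: a server resized from host1 to host2, not yet
    # confirmed or reverted, followed by an attempt to delete host1's service.
    pending = [Migration("host1", "host2", "finished")]
    try:
        check_can_delete_service(["host1"], pending)
    except ServiceDeleteConflict as exc:
        print(exc.status_code, exc)
```

  Checking both the source and destination nodes matters because the
  resource provider on either side holds allocations until the resize is
  confirmed or reverted.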

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1852610/+subscriptions

