yahoo-eng-team team mailing list archive

[Bug 1856925] Re: Nova compute service exception that performs cold migration virtual machine stuck in resize state.

 

Reviewed:  https://review.opendev.org/700062
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=ea2ea492a3d046d53d44039206fff69fe7e3ac61
Submitter: Zuul
Branch:    master

commit ea2ea492a3d046d53d44039206fff69fe7e3ac61
Author: Matt Riedemann <mriedem.os@xxxxxxxxx>
Date:   Thu Dec 19 13:29:50 2019 -0500

    Ensure source service is up before resizing/migrating
    
    If the source compute service is down when a resize or
    cold migrate is initiated the prep_resize cast from the
    selected destination compute service to the source will
    fail/hang. The API can validate the source compute service
    is up or fail the operation with a 409 response if the
    source service is down. Note that a host status of
    "MAINTENANCE" means the service is up but disabled by
    an administrator which is OK for resize/cold migrate.
    
    The solution here works the validation into the
    check_instance_host decorator which surprisingly isn't
    used in more places where the source host is involved
    like reboot, rebuild, snapshot, etc. This change just
    handles the resize method but is done in such a way that
    the check_instance_host decorator could be applied to
    those other methods and perform the is-up check as well.
    The decorator is made backward compatible by default.
    
    Note that Instance._save_services is added because during
    resize the Instance is updated and the services field
    is set but not actually changed, but Instance.save()
    handles object fields differently so we need to implement
    the no-op _save_services method to avoid a failure.
    
    Change-Id: I85423c7bcacff3bc465c22686d0675529d211b59
    Closes-Bug: #1856925
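
The decorator approach described in the commit message can be illustrated
with a minimal, self-contained sketch. This is not nova's actual code: the
names ServiceUnavailable, FakeServiceGroup, and the check_is_up parameter
are assumptions made for illustration, and the real decorator in nova may
differ in signature and behaviour. The sketch only shows the idea of an
opt-in "is the source service up" check that stays backward compatible
for methods that do not request it.

```python
import functools


class ServiceUnavailable(Exception):
    """Hypothetical error that an API layer could map to an HTTP 409."""


class FakeServiceGroup:
    """Hypothetical stand-in for a compute service liveness lookup."""

    def __init__(self, up_hosts):
        self._up_hosts = set(up_hosts)

    def service_is_up(self, host):
        # Only liveness is checked here. A service that is up but
        # administratively disabled would still pass this check.
        return host in self._up_hosts


def check_instance_host(check_is_up=False):
    """Require the instance to have a host; optionally require it to be up.

    check_is_up defaults to False so that existing users of the decorator
    keep their previous behaviour.
    """
    def outer(func):
        @functools.wraps(func)
        def wrapped(self, instance, *args, **kwargs):
            host = instance.get("host")
            if not host:
                raise ValueError("instance has no host")
            if check_is_up and not self.servicegroup.service_is_up(host):
                raise ServiceUnavailable(
                    f"compute service on {host} is down")
            return func(self, instance, *args, **kwargs)
        return wrapped
    return outer


class API:
    """Toy API object; only the decorated methods matter for the sketch."""

    def __init__(self, servicegroup):
        self.servicegroup = servicegroup

    @check_instance_host(check_is_up=True)
    def resize(self, instance):
        return f"resizing on {instance['host']}"

    @check_instance_host()
    def reboot(self, instance):
        return f"rebooting on {instance['host']}"
```

With this shape, resize fails fast when the source host's service is down,
while reboot keeps the older host-only check until someone opts it in.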


** Changed in: nova
       Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1856925

Title:
  Nova compute service exception that performs cold migration virtual
  machine stuck in resize state.

Status in OpenStack Compute (nova):
  Fix Released

Bug description:
  Description:  
   If the nova-compute service fails (for example, it is down), an instance gets stuck in the resize state during cold migration and cannot be evacuated. The nova API accepts the request and updates the server status and task state, but the compute service never receives the request, so the server remains in the resize state. When nova-compute is restarted, the server state becomes ERROR. It is recommended to add validation to prevent instances from entering inoperable states.
   This can also happen with commands such as stop/rebuild/reboot.
  Environment:
  1. openstack-Q; nova -version: 9.1.1

  2. hypervisor: Libvirt + KVM

  3. One control node, two compute nodes.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1856925/+subscriptions

