
yahoo-eng-team team mailing list archive

[Bug 1889108] Re: failures during driver.pre_live_migration remove source attachments during rollback

 

Reviewed:  https://review.opendev.org/743319
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=2102f1834a6ac9fd870bfb457b28a2172f33e281
Submitter: Zuul
Branch:    master

commit 2102f1834a6ac9fd870bfb457b28a2172f33e281
Author: Lee Yarwood <lyarwood@xxxxxxxxxx>
Date:   Mon Jul 27 19:27:24 2020 +0100

    compute: Don't delete the original attachment during pre LM rollback
    
    I0bfb11296430dfffe9b091ae7c3a793617bd9d0d introduced support for live
    migration with cinderv3 volume attachments during Queens. This initial
    support handled failures in pre_live_migration directly by removing any
    attachments created on the destination and reverting to the original
    attachment ids before re-raising the caught exception to the source
    compute. It also added rollback code within the main
    _rollback_live_migration method but missed that this would also be
    called during a pre_live_migration rollback.
    
    As a result, after a failure in pre_live_migration,
    _rollback_live_migration will attempt to delete the source host volume
    attachments referenced by the bdms before updating the bdms with the
    now non-existent attachment ids, leaving the volumes in an `available`
    state in Cinder as they no longer have any attachment records
    associated with them.
    
    This change resolves this within _rollback_volume_bdms by ensuring
    that the current and original attachment_ids are not equal before
    requesting that the current attachment referenced by the bdm is
    deleted. When called after a failure in pre_live_migration, no attempt
    is then made to remove the original source host attachments from
    Cinder.
    
    Note that the following changes muddy the waters slightly here but
    introduced no actual changes to the logic within
    _rollback_live_migration:
    
    * I0f3ab6604d8b79bdb75cf67571e359cfecc039d8 reworked some of the error
      handling in Rocky but isn't the source of the issue here.
    
    * Ibe9215c07a1ee00e0e121c69bcf7ee1b1b80fae0 reworked
      _rollback_live_migration to use the provided source_bdms.
    
    * I6bc73e8c8f98d9955f33f309beb8a7c56981b553 then refactored
      _rollback_live_migration, moving the logic into a self-contained
      _rollback_volume_bdms method.
    
    Closes-Bug: #1889108
    Change-Id: I9edb36c4df1cc0d8b529e669f06540de71766085
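
The guard described in the commit message can be sketched as follows. This is a minimal illustration, not nova's actual implementation: the Bdm class and the deleted list stand in for nova's block device mapping objects and the Cinder attachment_delete call.

```python
class Bdm:
    """Stand-in for a block device mapping tracking cinder attachment ids."""
    def __init__(self, attachment_id, original_attachment_id):
        self.attachment_id = attachment_id                     # current attachment
        self.original_attachment_id = original_attachment_id   # source attachment

def rollback_volume_bdms(bdms, deleted):
    """Revert each bdm to its original attachment, recording deletions."""
    for bdm in bdms:
        # Guard: after a pre_live_migration failure the destination
        # attachment has already been removed and the bdm already points
        # at the original source attachment, so nothing must be deleted.
        if bdm.attachment_id != bdm.original_attachment_id:
            deleted.append(bdm.attachment_id)  # stand-in for attachment_delete
        bdm.attachment_id = bdm.original_attachment_id

deleted = []
# After a pre_live_migration failure: current == original, nothing deleted.
rollback_volume_bdms([Bdm('src-attach', 'src-attach')], deleted)
# After a later failure: the destination attachment differs and is deleted.
rollback_volume_bdms([Bdm('dest-attach', 'src-attach')], deleted)
```

Without the guard, the first call would delete the source attachment itself, leaving the volume `available` in Cinder while still connected on the host.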


** Changed in: nova
       Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1889108

Title:
  failures during driver.pre_live_migration remove source attachments
  during rollback

Status in OpenStack Compute (nova):
  Fix Released
Status in OpenStack Compute (nova) queens series:
  New
Status in OpenStack Compute (nova) rocky series:
  New
Status in OpenStack Compute (nova) stein series:
  New
Status in OpenStack Compute (nova) train series:
  New
Status in OpenStack Compute (nova) ussuri series:
  In Progress

Bug description:
  Description
  ===========

  As described in the subject line, after a failure in
  driver.pre_live_migration the initial rollback and removal of any
  destination volume attachments is then repeated for the source volume
  attachments, leaving the volumes connected on the host but listed as
  `available` in cinder.

  Steps to reproduce
  ==================
  Cause a failure during the call to driver.pre_live_migration with volumes attached.

  Expected result
  ===============
  Any volume attachments for the destination host are deleted during the rollback.

  Actual result
  =============
  Both sets of volume attachments, for the destination *and* the source, are removed.

  Environment
  ===========
  1. Exact version of OpenStack you are running. See the following
    list for all releases: http://docs.openstack.org/releases/

     eeeb964a5f65e6ac31dfb34b1256aaf95db5ba3a

  2. Which hypervisor did you use?
     (For example: Libvirt + KVM, Libvirt + XEN, Hyper-V, PowerKVM, ...)
     What's the version of that?

     libvirt + KVM

  3. Which storage type did you use?
     (For example: Ceph, LVM, GPFS, ...)
     What's the version of that?

     N/A

  4. Which networking type did you use?
     (For example: nova-network, Neutron with OpenVSwitch, ...)

     N/A

  Logs & Configs
  ==============

  When live-migration fails with attached volume changed to active and still in nova
  https://bugzilla.redhat.com/show_bug.cgi?id=1860914

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1889108/+subscriptions