yahoo-eng-team team mailing list archive

[Bug 2080436] Re: Live migration breaks VM on NUMA enabled systems with shared storage

 

Reviewed:  https://review.opendev.org/c/openstack/nova/+/928970
Committed: https://opendev.org/openstack/nova/commit/035b8404fce878b0a88c4741bea46135b6af51e8
Submitter: "Zuul (22348)"
Branch:    master

commit 035b8404fce878b0a88c4741bea46135b6af51e8
Author: Matthew N Heler <matthew.heler@xxxxxxxxxxx>
Date:   Wed Sep 11 12:28:15 2024 -0500

    Fix regression with live migration on shared storage
    
    The commit c1ccc1a3165ec1556c605b3b036274e992b0a09d introduced
    a regression when NUMA live migration was done on shared storage.
    
    The live migration support for the power mgmt feature means we need to
    call driver.cleanup() for all NUMA instances to potentially offline
    pcpus that are not used any more after the instance is migrated away.
    However this change exposed an issue with the disk cleanup logic. Nova
    should never delete the instance directory if that directory is on
    shared storage (e.g. the nova instances path is backed by NFS).
    
    This patch fixes that behavior so that live migration works
    correctly on shared storage.
    
    Closes-Bug: #2080436
    Change-Id: Ia2bbb5b4ac728563a8aabd857ed0503449991df1
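
The rule stated in the commit message can be illustrated with a minimal sketch. This is not nova's code: `CleanupFlags`, `cleanup_flags`, and the `is_numa_instance` parameter are hypothetical names chosen for illustration. The idea is only that "run cleanup" and "delete the instance directory" are decided separately, so a NUMA instance on shared storage still gets cleanup (for pCPU offlining) without losing its files.

```python
from dataclasses import dataclass


@dataclass
class CleanupFlags:
    do_cleanup: bool     # whether the source host runs driver cleanup at all
    destroy_disks: bool  # whether that cleanup may delete the instance directory


def cleanup_flags(is_shared_instance_path: bool,
                  is_numa_instance: bool) -> CleanupFlags:
    """Hypothetical helper: decide cleanup and disk deletion independently."""
    # NUMA instances need cleanup so unused pCPUs can be offlined;
    # local instance directories need cleanup so they are removed.
    do_cleanup = is_numa_instance or not is_shared_instance_path
    # Files on shared storage are still in use by the destination host.
    destroy_disks = not is_shared_instance_path
    return CleanupFlags(do_cleanup, destroy_disks)
```

Under these assumptions, a NUMA instance on an NFS-backed instances path yields `do_cleanup=True` and `destroy_disks=False`.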


** Changed in: nova
       Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/2080436

Title:
  Live migration breaks VM on NUMA enabled systems with shared storage

Status in OpenStack Compute (nova):
  Fix Released

Bug description:
  The commit c1ccc1a3165ec1556c605b3b036274e992b0a09d introduced
  a regression when NUMA live migration was done on shared storage:

              power_management_possible = (
                  'dst_numa_info' in migrate_data and
                  migrate_data.dst_numa_info is not None)
              # No instance booting at source host, but instance dir
              # must be deleted for preparing next block migration
              # must be deleted for preparing next live migration w/o shared
              # storage
              # vpmem must be cleaned
              do_cleanup = (not migrate_data.is_shared_instance_path or
                            has_vpmem or has_mdevs or power_management_possible)

  Based on the commit, if any type of NUMA system is used with shared
  storage, live migration will delete the backing folder for the VM,
  making the VM unusable for future operations.
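
To make the effect of the quoted condition concrete, here is a small standalone sketch that evaluates the same boolean expression. The `SimpleNamespace` objects are stand-ins for nova's real `migrate_data` object, and `do_cleanup_flag` is a hypothetical wrapper, so this only demonstrates the logic, not nova's behavior end to end.

```python
from types import SimpleNamespace


def do_cleanup_flag(migrate_data, has_vpmem=False, has_mdevs=False):
    """Evaluate the quoted condition against a stand-in migrate_data object."""
    # Stand-in for "'dst_numa_info' in migrate_data" on the real object.
    power_management_possible = (
        getattr(migrate_data, "dst_numa_info", None) is not None)
    return (not migrate_data.is_shared_instance_path or
            has_vpmem or has_mdevs or power_management_possible)


# A NUMA instance on shared storage: dst_numa_info is populated.
numa_on_nfs = SimpleNamespace(is_shared_instance_path=True,
                              dst_numa_info=object())
# A non-NUMA instance on the same shared storage.
plain_on_nfs = SimpleNamespace(is_shared_instance_path=True,
                               dst_numa_info=None)

print(do_cleanup_flag(numa_on_nfs))   # True
print(do_cleanup_flag(plain_on_nfs))  # False
```

The first case is the one reported here: cleanup is requested even though the instance path is shared, so any disk deletion performed as part of that cleanup hits files the destination host still needs.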

  My team is experiencing this issue on 2024.1.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/2080436/+subscriptions


