[Bug 1899835] Re: n-cpu attempts to disconnect volumes after early pre_live_migration failures on the destination during a live migration
Reviewed: https://review.opendev.org/c/openstack/nova/+/946600
Committed: https://opendev.org/openstack/nova/commit/5a55a78d510b86975f0f4f8f43ee1feef7206244
Submitter: "Zuul (22348)"
Branch: master
commit 5a55a78d510b86975f0f4f8f43ee1feef7206244
Author: melanie witt <melwittt@xxxxxxxxx>
Date: Mon Apr 7 18:25:40 2025 -0700
live migration: Avoid volume rollback mismatches
The tl;dr is to 1) avoid trying to disconnect volumes on the
destination if they were never connected in the first place and
2) avoid trying to disconnect volumes on the destination using block
device info for the source.
Details:
* Only remotely disconnect volumes on the destination if the failure
was not during pre_live_migration(). When pre_live_migration() fails,
its exception handling deletes the Cinder attachment that was created
before re-raising and returning from the RPC call. And the BDM
connection_info in the database is not guaranteed to reference the
destination because a failure could have happened after the Cinder
attachment was created but before the new connection_info was saved
back to the database. In this scenario, there is no way to reliably
disconnect volumes on the destination remotely from the source because
the destination connection_info needed to do it might not be
available.
* Because of the first point, this adds exception handling to disconnect
the volumes while still on the destination, where the destination
connection_info is still available, instead of trying to do it
remotely from the source afterward (a sketch of this idea follows the
commit message below).
* Do not pass Cinder volume block_device_info when calling
rollback_live_migration_on_destination() because volume BDM records
have already been rolled back to contain info for the source by
that point. Not passing volume block_device_info will prevent
driver.destroy() and subsequently driver.cleanup() from attempting to
disconnect volumes on the destination using connection_info for the
source.
Closes-Bug: #1899835
Change-Id: Ia62b99a16bfc802b8ba895c31780e9956aa74c2d
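Below is a minimal, illustrative Python sketch of the destination-side
rollback described in the second bullet above. It is not Nova's actual
code: the FakeVolumeDriver class, the VolumeConnectError exception and
the simplified pre_live_migration() helper are hypothetical stand-ins
for the real driver and manager code paths.

    class VolumeConnectError(Exception):
        pass

    class FakeVolumeDriver:
        """Stand-in for a hypervisor volume driver on the destination."""

        def connect_volume(self, connection_info):
            if connection_info.get("fail"):
                raise VolumeConnectError(connection_info["volume_id"])
            print("connected %s" % connection_info["volume_id"])

        def disconnect_volume(self, connection_info):
            print("disconnected %s" % connection_info["volume_id"])

    def pre_live_migration(driver, dest_connection_infos):
        """Connect each volume on the destination; on failure, undo locally."""
        connected = []
        try:
            for ci in dest_connection_infos:
                driver.connect_volume(ci)
                connected.append(ci)
        except Exception:
            # Roll back only what was actually connected, using the
            # destination connection_info that is still available here.
            for ci in reversed(connected):
                driver.disconnect_volume(ci)
            # Re-raise so the source sees the failure and skips its own
            # remote disconnect attempt against the destination.
            raise

    if __name__ == "__main__":
        infos = [{"volume_id": "vol-1"},
                 {"volume_id": "vol-2", "fail": True}]
        try:
            pre_live_migration(FakeVolumeDriver(), infos)
        except VolumeConnectError as exc:
            print("pre_live_migration failed for volume %s" % exc)

In this sketch only the volumes that were actually connected on the
destination get disconnected there, and the re-raised exception is what
tells the source not to attempt its own remote cleanup.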
** Changed in: nova
Status: In Progress => Fix Released
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1899835
Title:
n-cpu attempts to disconnect volumes after early pre_live_migration
failures on the destination during a live migration
Status in OpenStack Compute (nova):
Fix Released
Bug description:
Description
===========
When live migrating an instance with volumes attached,
pre_live_migration on the destination will initially attempt to map
these volumes to the destination by creating a volume attachment
(cinderv3) or calling initialize_connection (cinderv2).
At present, if either call fails, the generic live migration rollback
code is called and an attempt is made to disconnect volumes from the
destination, ignoring the fact that they were never mapped or
connected to by that host.
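A compact, hypothetical sketch of that failure mode (not Nova code):
after an early pre_live_migration failure, the source-side rollback
still asks the destination to disconnect volumes, passing
connection_info that describes the source. The DestinationHost class
and rollback helper below are illustrative stand-ins only.

    class DestinationHost:
        """Stand-in for the destination compute; nothing was connected."""

        def __init__(self):
            self.connected_targets = set()

        def disconnect_volume(self, connection_info):
            target = connection_info["target"]
            if target not in self.connected_targets:
                raise RuntimeError(
                    "no such device on destination: %s" % target)

    def rollback_live_migration(dest, bdms):
        # Pre-fix behaviour: unconditionally disconnect on the destination
        # using whatever connection_info the BDMs hold (here, the source's).
        for bdm in bdms:
            dest.disconnect_volume(bdm["connection_info"])

    source_bdms = [
        {"connection_info": {"target": "/dev/disk/by-path/source-only"}}]
    try:
        rollback_live_migration(DestinationHost(), source_bdms)
    except RuntimeError as exc:
        print("rollback failed: %s" % exc)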
Steps to reproduce
==================
* Live migrate an instance with volumes attached, ensuring calls to
either cinder API fail during pre_live_migration.
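One hedged way to provoke this outside a full deployment is to stub
the attachment call so it raises before any volume is connected on the
destination; the FakeCinderAPI class and simplified helper below are
illustrative stand-ins, not the real cinderclient interface.

    from unittest import mock

    class FakeCinderAPI:
        """Illustrative stand-in for the volume attachment call."""

        def attachment_create(self, volume_id, instance_uuid):
            return {"id": "attachment-1"}

    def pre_live_migration(cinder, volume_ids, instance_uuid):
        # Create an attachment per volume; any failure aborts before the
        # volumes are ever connected on the destination host.
        return [cinder.attachment_create(v, instance_uuid)
                for v in volume_ids]

    cinder = FakeCinderAPI()
    with mock.patch.object(cinder, "attachment_create",
                           side_effect=RuntimeError("cinder unavailable")):
        try:
            pre_live_migration(cinder, ["vol-1"], "instance-1")
        except RuntimeError as exc:
            # This is where the generic rollback used to try to disconnect
            # never-connected volumes from the destination.
            print("pre_live_migration failed: %s" % exc)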
Expected result
===============
No attempt is made to disconnect volumes from the destination.
Actual result
=============
n-cpu attempts and fails to disconnect volumes from the destination using connection_info from the source bdms.
Environment
===========
1. Exact version of OpenStack you are running. See the following
list for all releases: http://docs.openstack.org/releases/
117508129461436e13c148bb068b0775d67e85d3
2. Which hypervisor did you use?
(For example: Libvirt + KVM, Libvirt + XEN, Hyper-V, PowerKVM, ...)
What's the version of that?
Libvirt
3. Which storage type did you use?
(For example: Ceph, LVM, GPFS, ...)
What's the version of that?
N/A
4. Which networking type did you use?
(For example: nova-network, Neutron with OpenVSwitch, ...)
N/A
Logs & Configs
==============
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1899835/+subscriptions