yahoo-eng-team team mailing list archive
-
yahoo-eng-team team
-
Mailing list archive
-
Message #71638
[Bug 1704071] Re: XenAPI: volume VM live migration failed with VDI_NOT_IN_MAP
Reviewed: https://review.openstack.org/538415
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=0e9cd6c4d66ca4afb95bb60edb412af9e96c546e
Submitter: Zuul
Branch: master
commit 0e9cd6c4d66ca4afb95bb60edb412af9e96c546e
Author: Brooks Kaminski <brooks.kaminski@xxxxxxxxxxxxx>
Date: Sat Jan 27 01:35:07 2018 -0600
XenAPI: XCP2.1+ Swallow VDI_NOT_IN_MAP Exception
Changes within XenAPI have enforced a more strict policy when checking
assert_can_migrate. In particular when checking the source_vdi:dest_sr
mapping it insists that the SR actually exist. This is not a problem for
local disks, however this assertation is called extremely early in the
live migration process (check_can_migrate_source) which is called from
conductor, which makes a problem for attached volumes.
This early in the process the host has just barely been chosen and no SR
information has been configured yet for these volumes or their initiators.
Additionally we cannot prepare this SR any earlier as BDM information is
not set up until the pre_live_migration method. With the options to either
skip this assertion completely or swallow the exception, I have chosen to
swallow the exception. My reasons for this are two-fold:
1. --block-migration can be called without regard for whether an iSCSI
volume is attached, and we still want to ensure that VIF, CPU and other
factors are checked, and not just skip all checks entirely.
2. Currently the Assert only exists within the --block-migration code
base but this needs to change. A future commit will remove this logic
to ensure that the commit runs without this flag. Once that is done we
want to be able to continue to use this Exception swallow logic rather
than continuing to skip the assert for all XCP2.1.0+ even without volumes.
This decision should help us handle less work in a future commit and does not
seem to align with the goals of that commit, where it does align properly here.
This commit still changes very little of the current codebase and puts us in
a good position to refactor the way this is handled at a later date, while
adding a TODO note to correct VM.assert_can_migrate only running during a
block migration.
Additionally there seems to be some confusion that the mapping data that is
generated during this initial trip through _call_live_migrate_command is needed
to continue along the code, however this data appears to be purely used to send
the mapping information through the assertation call, and is then discarded.
The only data returned from these methods is the original dest_data which
is carried into the live_migration method. The _call_live_migration method is
called again during the live_migration() method, and during this time it does
need that mapping to send along to XenAPI for the actual migration, but not
yet. Because this codebase is so confusing, I am providing a little bit of
context on the movement of these variables with some psuedocode:
---CONDUCTOR.TASKS.LIVE_MIGRATE---
LiveMigrationTask.Execute()
self._find_destination() <-----
Unrelated Work
compute.live_migration(self, host, instance, destination,
block_migrate, migration, migrate_data)
LiveMigrationTask._find_destination()
Scheduler Things. Gets a Dest ref.
_check_compatible_with_source_hyp
_call_livem_checks_on_host(host) <-----
_check_can_live_migrate_destination()
returns Host Node Name and Host Compute. That's all.
---COMPUTE.MANAGER---
_do / _check_live_migration_destination
dest_check_data = xenops.can_live_migrate_destination
(Checks for the Assert)
try:
migrate_data = check_can_live_migrate_source(dest_check_data)
return migrate_data
---VMOPS--
check_can_migrate_source(self, ctxt, instance_ref, dest_check_data)
if block_migration:
_call_live_migration_command(assert_can_migrate)
_generate_vdi_map()
Does NOT return
ALSO Does NOT return
return dest_check_data
The changes made to address this issue are a fairly simple oslo_utils
version check. To pull this data I created two new host related methods
within VMops as well as a new import of oslo_config.versionutils.
I believe these methods ultimately belong in the xenapi.host class, but
for two very small methods I believed it better to avoid such a large import
to get minimal data.
Finally, an adjustment to the Fake XenAPI driver had to be made as it currently
does not include the host details beyond hostname environment in the
create_host check. The change amends the stub dictionary to include this data.
Change-Id: Ie5295564a68f877fc061376c00f3fa706370a251
Closes-Bug: #1704071
** Changed in: nova
Status: In Progress => Fix Released
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1704071
Title:
XenAPI: volume VM live migration failed with VDI_NOT_IN_MAP
Status in OpenStack Compute (nova):
Fix Released
Bug description:
When boot a VM with a volume attached, and then perform live
migration. It will fail on the destination compute with the following
VDI_NOT_IN_MAP error.
Jul 13 05:06:55 ComputeNode-1 nova-compute[7780]: File "/opt/stack/nova/nova/compute/manager.py", line 198, in decorated_function
Jul 13 05:06:55 ComputeNode-1 nova-compute[7780]: return function(self, context, *args, **kwargs)
Jul 13 05:06:55 ComputeNode-1 nova-compute[7780]: File "/opt/stack/nova/nova/compute/manager.py", line 5343, in check_can_live_migrate_source
Jul 13 05:06:55 ComputeNode-1 nova-compute[7780]: block_device_info)
Jul 13 05:06:55 ComputeNode-1 nova-compute[7780]: File "/opt/stack/nova/nova/virt/xenapi/driver.py", line 491, in check_can_live_migrate_source
Jul 13 05:06:55 ComputeNode-1 nova-compute[7780]: dest_check_data)
Jul 13 05:06:55 ComputeNode-1 nova-compute[7780]: File "/opt/stack/nova/nova/virt/xenapi/vmops.py", line 2334, in check_can_live_migrate_source
Jul 13 05:06:55 ComputeNode-1 nova-compute[7780]: raise exception.MigrationPreCheckError(reason=msg)
Jul 13 05:06:55 ComputeNode-1 nova-compute[7780]: MigrationPreCheckError: Migration pre-check error: assert_can_migrate failed because: VDI_NOT_IN_MAP
XenServer version is 7.1
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1704071/+subscriptions
References