yahoo-eng-team team mailing list archive
Message #76874
[Bug 1814245] Re: _disconnect_volume incorrectly called for multiattach volumes during post_live_migration
** Tags added: libvirt live-migration multiattach volumes
** Changed in: nova
Importance: Undecided => Medium
** Also affects: nova/queens
Importance: Undecided
Status: New
** Also affects: nova/rocky
Importance: Undecided
Status: New
** Changed in: nova/queens
Status: New => Triaged
** Changed in: nova/rocky
Status: New => Triaged
** Changed in: nova/queens
Importance: Undecided => Medium
** Changed in: nova/rocky
Importance: Undecided => Medium
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1814245
Title:
_disconnect_volume incorrectly called for multiattach volumes during
post_live_migration
Status in OpenStack Compute (nova):
In Progress
Status in OpenStack Compute (nova) queens series:
Triaged
Status in OpenStack Compute (nova) rocky series:
Triaged
Bug description:
Description
===========
Idc5cecffa9129d600c36e332c97f01f1e5ff1f9f introduced a simple check to
ensure disconnect_volume is only called when detaching a multi-attach
volume from the final instance using it on a given host.
That change however doesn't take live migration into account, and more
specifically the call to _disconnect_volume during post_live_migration
at the end of a migration from the source host. At this point the
original instance has already moved, so the call to
objects.InstanceList.get_uuids_by_host will only return one local
instance that is using the volume instead of two, allowing
disconnect_volume to be called.
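For illustration, the check behaves roughly like the sketch below. This
is not the actual nova code: only objects.InstanceList.get_uuids_by_host
is taken from the report; the volume_api.get_attachments helper and the
other names are hypothetical.

    from nova import objects

    def _other_local_instances_use_volume(context, volume_id, host,
                                          volume_api):
        # Instances the database currently places on this compute host.
        local_uuids = set(
            objects.InstanceList.get_uuids_by_host(context, host))
        # One attachment record per instance using the multi-attach
        # volume (volume_api.get_attachments is a hypothetical helper).
        attachments = volume_api.get_attachments(context, volume_id)
        local = [a for a in attachments
                 if a['instance_uuid'] in local_uuids]
        # disconnect_volume is skipped only while more than one local
        # instance uses the volume. During post_live_migration the
        # migrated instance has already left the source host, so only
        # the remaining instance is counted and the guard no longer
        # holds.
        return len(local) > 1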
Depending on the backend being used, this call can either succeed,
removing the host's connection to the volume out from under the
remaining instance, or os-brick can fail where it needs to flush I/O
from the still in-use connection.
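For the LVM/iSCSI + multipath environment used below, the failure mode
is the flush of a still in-use multipath map. A rough, self-contained
illustration (not the os-brick code; it only mirrors the "multipath -f"
call visible in the traceback under Logs & Configs):

    import subprocess

    def flush_multipath_map(wwid):
        # Equivalent of the "multipath -f <wwid>" call in the
        # traceback. While the remaining instance still has I/O on the
        # map, the flush is refused with "map in use", the command
        # exits non-zero, and the resulting error fails the live
        # migration.
        subprocess.run(['multipath', '-f', wwid], check=True)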
Steps to reproduce
==================
* Launch two instances attached to the same multiattach volume on the same host.
* Live migrate one of these instances to another host.
Expected result
===============
No calls to disconnect_volume are made and the remaining instance on
the host is still able to access the multi-attach volume.
Actual result
=============
A call to disconnect_volume is made and the remaining instance is
unable to access the volume, *or* the live migration fails because
os-brick cannot disconnect the in-use volume on the host.
Environment
===========
1. Exact version of OpenStack you are running. See the following
list for all releases: http://docs.openstack.org/releases/
master
2. Which hypervisor did you use?
(For example: Libvirt + KVM, Libvirt + XEN, Hyper-V, PowerKVM, ...)
Libvirt + KVM
3. Which storage type did you use?
(For example: Ceph, LVM, GPFS, ...)
What's the version of that?
LVM/iSCSI with multipath enabled reproduces the os-brick failure.
4. Which networking type did you use?
(For example: nova-network, Neutron with OpenVSwitch, ...)
N/A
Logs & Configs
==============
# nova show testvm2
[..]
| fault | {"message": "Unexpected error while running command. |
| | Command: multipath -f 360014054a424982306a4a659007f73b2 |
| | Exit code: 1 |
| | Stdout: u'Jan 28 16:09:29 | 360014054a424982306a4a659007f73b2: map in use\ |
| | Jan 28 16:09:29 | failed to remove multipath map 360014054a424982306a4a", "code": 500, "details": " |
| | File \"/usr/lib/python2.7/site-packages/nova/compute/manager.py\", line 202, in decorated_function |
| | return function(self, context, *args, **kwargs) |
| | File \"/usr/lib/python2.7/site-packages/nova/compute/manager.py\", line 6299, in _post_live_migration |
| | migrate_data) |
| | File \"/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py\", line 7744, in post_live_migration |
| | self._disconnect_volume(context, connection_info, instance) |
| | File \"/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py\", line 1287, in _disconnect_volume |
| | vol_driver.disconnect_volume(connection_info, instance) |
| | File \"/usr/lib/python2.7/site-packages/nova/virt/libvirt/volume/iscsi.py\", line 74, in disconnect_volume |
| | self.connector.disconnect_volume(connection_info['data'], None) |
| | File \"/usr/lib/python2.7/site-packages/os_brick/utils.py\", line 150, in trace_logging_wrapper |
| | result = f(*args, **kwargs) |
| | File \"/usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py\", line 274, in inner |
| | return f(*args, **kwargs) |
| | File \"/usr/lib/python2.7/site-packages/os_brick/initiator/connectors/iscsi.py\", line 848, in disconnect_volume |
| | ignore_errors=ignore_errors) |
| | File \"/usr/lib/python2.7/site-packages/os_brick/initiator/connectors/iscsi.py\", line 885, in _cleanup_connection |
| | force, exc) |
| | File \"/usr/lib/python2.7/site-packages/os_brick/initiator/linuxscsi.py\", line 219, in remove_connection |
| | self.flush_multipath_device(multipath_name) |
| | File \"/usr/lib/python2.7/site-packages/os_brick/initiator/linuxscsi.py\", line 275, in flush_multipath_device |
| | root_helper=self._root_helper) |
| | File \"/usr/lib/python2.7/site-packages/os_brick/executor.py\", line 52, in _execute |
| | result = self.__execute(*args, **kwargs) |
| | File \"/usr/lib/python2.7/site-packages/os_brick/privileged/rootwrap.py\", line 169, in execute |
| | return execute_root(*cmd, **kwargs) |
| | File \"/usr/lib/python2.7/site-packages/oslo_privsep/priv_context.py\", line 207, in _wrap |
| | return self.channel.remote_call(name, args, kwargs) |
| | File \"/usr/lib/python2.7/site-packages/oslo_privsep/daemon.py\", line 202, in remote_call |
| | raise exc_type(*result[2]) |
| | ", "created": "2019-01-28T07:10:09Z"}
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1814245/+subscriptions