yahoo-eng-team team mailing list archive
Message #40699
[Bug 1385798] Re: Multipath ISCSI connections left open after disconnecting volume with libvirt
There is a monster backport patch proposed for this on stable/kilo:
https://review.openstack.org/#/c/229152
It sounds like this is already fixed in os-brick, which nova has used
since liberty, and the aim is to backport all of the os-brick changes
to nova's libvirt iscsi volume driver in kilo, using this bug as the
coordinator.
** Also affects: nova/kilo
Importance: Undecided
Status: New
** Tags added: kilo-backport-potential
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1385798
Title:
Multipath ISCSI connections left open after disconnecting volume with
libvirt
Status in OpenStack Compute (nova):
Confirmed
Status in OpenStack Compute (nova) kilo series:
New
Bug description:
When disconnecting a volume from an instance, the iSCSI multipath
connection is not always cleaned up correctly. When running the
Tempest tests we see test failures related to this: the connection
is not closed, but a disconnect is then requested through the
cinder driver, which ends up breaking the iSCSI connection. The end
result is that there are still entries in /dev/disk/by-path for the
old iSCSI connections, but they are in an error state and cannot be
used.
In the syslog we get errors like:
Oct 25 17:23:21 localhost kernel: [ 2974.200680] connection44:0: detected conn error (1020)
Oct 25 17:23:21 localhost kernel: [ 2974.200819] connection43:0: detected conn error (1020)
Oct 25 17:23:21 localhost iscsid: Kernel reported iSCSI connection 44:0 error (1020 - ISCSI_ERR_TCP_CONN_CLOSE: TCP connection closed) state (3)
Oct 25 17:23:21 localhost iscsid: Kernel reported iSCSI connection 43:0 error (1020 - ISCSI_ERR_TCP_CONN_CLOSE: TCP connection closed) state (3)
After running the tests, if I run "multipath -l" there are numerous
entries (which shouldn't exist anymore), and if I run "iscsiadm -m
node" it still shows the connections to the backend, even though they
are supposed to have been disconnected (and have been disconnected on
the backend via the cinder driver).
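A minimal sketch of how the leftover node records described above could be spotted, not code from nova or os-brick. It assumes each line of "iscsiadm -m node" output has the form "<portal>:<port>,<tpgt> <iqn>". The helper names and the idea of passing in the set of IQNs that are still expected to be attached are hypothetical.

```python
import re

# Assumed line format of `iscsiadm -m node`: "<ip>:<port>,<tpgt> <iqn>"
NODE_LINE = re.compile(r"^(?P<portal>\S+:\d+),\d+\s+(?P<iqn>\S+)$")


def parse_iscsi_nodes(output):
    """Return a list of (portal, iqn) tuples parsed from the command text."""
    nodes = []
    for line in output.splitlines():
        match = NODE_LINE.match(line.strip())
        if match:
            nodes.append((match.group("portal"), match.group("iqn")))
    return nodes


def find_leftover_nodes(output, expected_iqns):
    """Return node records whose IQN is not among the volumes still attached."""
    return [node for node in parse_iscsi_nodes(output)
            if node[1] not in expected_iqns]
```

The text passed in would come from running "iscsiadm -m node" on the compute host after the tests. Any record returned by find_leftover_nodes is a candidate for the stale connections reported here.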
The disconnect code in cinder/brick does not seem to suffer from these
issues. From the looks of the source code, it works a little
differently when disconnecting multipath volumes and will always clean
up the SCSI connection first. We might need to do something more like
that in nova/virt/libvirt/volume.py too.
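As a rough illustration of the "clean up the SCSI connection first" ordering, the sketch below builds an ordered list of commands without executing anything. This is one interpretation of that ordering, not the actual cinder/brick or os-brick code. The function name and its parameters are hypothetical; the commands themselves use the standard "multipath -f", sysfs device delete, and "iscsiadm -m node ... --logout" / "--op delete" forms.

```python
def build_disconnect_plan(multipath_name, scsi_devices, sessions):
    """Return ordered argv lists for a teardown; nothing is executed here.

    multipath_name: device-mapper name of the multipath device, or None.
    scsi_devices: kernel names of the path devices, e.g. ["sdb", "sdc"].
    sessions: (portal, iqn) tuples for the iSCSI sessions to close.
    """
    plan = []
    if multipath_name:
        # 1. Flush the multipath map so nothing holds the paths open.
        plan.append(["multipath", "-f", multipath_name])
    for dev in scsi_devices:
        # 2. Remove each SCSI path device through sysfs.
        plan.append(["sh", "-c",
                     "echo 1 > /sys/block/%s/device/delete" % dev])
    for portal, iqn in sessions:
        # 3. Only then log out of the session and drop the node record.
        plan.append(["iscsiadm", "-m", "node", "-T", iqn,
                     "-p", portal, "--logout"])
        plan.append(["iscsiadm", "-m", "node", "-T", iqn,
                     "-p", portal, "--op", "delete"])
    return plan
```

Returning a plan rather than running the commands keeps the ordering easy to inspect. A real implementation would also need root privileges, validation of the device names, and a check that no other volume shares the same iSCSI session before logging out.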
I'm seeing this on the latest master and Juno branches. I haven't yet
tested on Icehouse, but it looks like it will probably reproduce there
too.