
yahoo-eng-team team mailing list archive

[Bug 1382440] [NEW] Detaching multipath volume doesn't work properly when using different targets with same portal for each multipath device

 

Public bug reported:

Overview:
On Icehouse (2014.1.2) with "iscsi_use_multipath=true", detaching an iSCSI
multipath volume doesn't work properly. When each multipath device uses a
different target (IQN) associated with the same portal, disconnect_volume()
deletes all of the targets, not only the one that belongs to the detached volume.

This problem is not yet fixed upstream. However, the attached patch
fixes it.

Steps to Reproduce:

We can easily reproduce this issue without any special storage
system with the following steps:

  1. Configure "iscsi_use_multipath=True" in nova.conf on the compute node.
  2. Configure "volume_driver=cinder.volume.drivers.lvm.LVMISCSIDriver"
     in cinder.conf on the cinder node.
  3. Create an instance.
  4. Create 3 volumes and attach them to the instance.
  5. Detach one of these volumes.
  6. Check "multipath -ll" and "iscsiadm --mode session".

Detail:

This problem was introduced by the following patch, which modified the
attach and detach volume operations to handle different targets
associated with different portals for the same multipath device.

  commit 429ac4dedd617f8c1f7c88dd8ece6b7d2f2accd0
  Author: Xing Yang <xing.yang@xxxxxxx>
  Date:   Mon Jan 6 17:27:28 2014 -0500

    Fixed a problem in iSCSI multipath

We found out that:

>         # Do a discovery to find all targets.
>         # Targets for multiple paths for the same multipath device
>         # may not be the same.
>         out = self._run_iscsiadm_bare(['-m',
>                                       'discovery',
>                                       '-t',
>                                       'sendtargets',
>                                       '-p',
>                                       iscsi_properties['target_portal']],
>                                       check_exit_code=[0, 255])[0] \
>             or ""
>
>         ips_iqns = self._get_target_portals_from_iscsiadm_output(out)
...
>         # If no other multipath device attached has the same iqn
>         # as the current device
>         if not in_use:
>             # disconnect if no other multipath devices with same iqn
>             self._disconnect_mpath(iscsi_properties, ips_iqns)
>             return
>         elif multipath_device not in devices:
>             # delete the devices associated w/ the unused multipath
>             self._delete_mpath(iscsi_properties, multipath_device, ips_iqns)

When each multipath device uses a different target (IQN) associated with the
same portal, ips_iqns contains all targets on the compute node, because it is
built from the result of
"iscsiadm -m discovery -t sendtargets -p <the same portal>".
_delete_mpath() then deletes all of the targets in ips_iqns
via /sys/block/sdX/device/delete.
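
The following is a minimal Python sketch of why ips_iqns is too broad in this setup. It is not the attached patch and not nova code: parse_discovery() and targets_for_volume() are hypothetical helpers, and the sample text only assumes the usual "<ip>:<port>,<tpgt> <iqn>" layout of sendtargets discovery output. Note that filtering on the IQN alone would not cover the case handled by the commit above, where one multipath device spans several IQNs, so this only illustrates the mismatch.

```python
def parse_discovery(output):
    """Parse sendtargets discovery output into (portal, iqn) pairs.

    Hypothetical helper: assumes one "<ip>:<port>,<tpgt> <iqn>" entry per line.
    """
    pairs = []
    for line in output.splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or unexpected lines
        portal = fields[0].split(",")[0]  # drop the ",<tpgt>" suffix
        pairs.append((portal, fields[1]))
    return pairs


def targets_for_volume(ips_iqns, target_iqn):
    """Keep only the entries whose IQN matches the volume being detached."""
    return [(ip, iqn) for ip, iqn in ips_iqns if iqn == target_iqn]


# Sample discovery result: three volumes exported through the same portal.
DISCOVERY = """\
192.168.0.55:3260,1 iqn.2010-10.org.openstack:volume-5c526ffa-ba88-4fe2-a570-9e35c4880d12
192.168.0.55:3260,1 iqn.2010-10.org.openstack:volume-b4495e7e-b611-4406-8cce-4681ac1e36de
192.168.0.55:3260,1 iqn.2010-10.org.openstack:volume-b2c01f6a-5723-40e7-9f21-f6b728021b0e
"""

DETACHED_IQN = (
    "iqn.2010-10.org.openstack:volume-5c526ffa-ba88-4fe2-a570-9e35c4880d12"
)

if __name__ == "__main__":
    ips_iqns = parse_discovery(DISCOVERY)
    # The unfiltered list covers every volume behind the portal.
    print("discovered targets:", len(ips_iqns))
    # Only one entry actually belongs to the volume being detached.
    print("targets for detached volume:",
          len(targets_for_volume(ips_iqns, DETACHED_IQN)))
```

With the three volumes from this report, the discovery list has three entries while only one of them belongs to the detached volume, which is the gap that lets the other two devices be removed.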

For example, we create an instance and attach 3 volumes to the instance:

  # iscsiadm --mode session
  tcp: [17] 192.168.0.55:3260,1 iqn.2010-10.org.openstack:volume-5c526ffa-ba88-4fe2-a570-9e35c4880d12
  tcp: [18] 192.168.0.55:3260,1 iqn.2010-10.org.openstack:volume-b4495e7e-b611-4406-8cce-4681ac1e36de
  tcp: [19] 192.168.0.55:3260,1 iqn.2010-10.org.openstack:volume-b2c01f6a-5723-40e7-9f21-f6b728021b0e
  # multipath -ll
  33000000300000001 dm-7 IET,VIRTUAL-DISK
  size=4.0G features='0' hwhandler='0' wp=rw
  `-+- policy='round-robin 0' prio=1 status=active
     `- 23:0:0:1 sdd 8:48 active ready running
  33000000100000001 dm-5 IET,VIRTUAL-DISK
  size=2.0G features='0' hwhandler='0' wp=rw
  `-+- policy='round-robin 0' prio=1 status=active
     `- 21:0:0:1 sdb 8:16 active ready running
  33000000200000001 dm-6 IET,VIRTUAL-DISK
  size=3.0G features='0' hwhandler='0' wp=rw
  `-+- policy='round-robin 0' prio=1 status=active
     `- 22:0:0:1 sdc 8:32 active ready running
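
As a rough way to confirm that a compute node is in the triggering configuration (several IQNs logged in through one portal), the session list can be grouped by portal. This is only a sketch: group_sessions_by_portal() is a hypothetical helper, and it assumes the "tcp: [N] <ip>:<port>,<tpgt> <iqn>" line layout shown above.

```python
from collections import defaultdict


def group_sessions_by_portal(session_output):
    """Map each portal ("ip:port") to the IQNs logged in through it."""
    by_portal = defaultdict(list)
    for line in session_output.splitlines():
        fields = line.split()
        # Expected shape: tcp: [17] 192.168.0.55:3260,1 iqn....
        if len(fields) < 4 or not fields[0].endswith(":"):
            continue
        portal = fields[2].split(",")[0]  # drop the ",<tpgt>" suffix
        by_portal[portal].append(fields[3])
    return dict(by_portal)


# Session list copied from this report.
SESSIONS = """\
tcp: [17] 192.168.0.55:3260,1 iqn.2010-10.org.openstack:volume-5c526ffa-ba88-4fe2-a570-9e35c4880d12
tcp: [18] 192.168.0.55:3260,1 iqn.2010-10.org.openstack:volume-b4495e7e-b611-4406-8cce-4681ac1e36de
tcp: [19] 192.168.0.55:3260,1 iqn.2010-10.org.openstack:volume-b2c01f6a-5723-40e7-9f21-f6b728021b0e
"""

if __name__ == "__main__":
    for portal, iqns in group_sessions_by_portal(SESSIONS).items():
        print(portal, "->", len(iqns), "target(s)")
```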

Then we detach one of these volumes:

  # nova volume-detach 95f959cd-d180-4063-ae03-9d21dbd7cc50 5c526ffa-ba88-4fe2-a570-9e35c4880d12

After detaching the volume, the compute node still has all 3 iSCSI sessions,
and the instance can no longer access the multipath devices that remain attached:

  # iscsiadm --mode session
  tcp: [17] 192.168.0.55:3260,1 iqn.2010-10.org.openstack:volume-5c526ffa-ba88-4fe2-a570-9e35c4880d12
  tcp: [18] 192.168.0.55:3260,1 iqn.2010-10.org.openstack:volume-b4495e7e-b611-4406-8cce-4681ac1e36de
  tcp: [19] 192.168.0.55:3260,1 iqn.2010-10.org.openstack:volume-b2c01f6a-5723-40e7-9f21-f6b728021b0e
  # multipath -ll
  33000000300000001 dm-7 ,
  size=4.0G features='0' hwhandler='0' wp=rw
  `-+- policy='round-robin 0' prio=0 status=enabled
     `- #:#:#:# -   #:# failed faulty running
  33000000200000001 dm-6 ,
  size=3.0G features='0' hwhandler='0' wp=rw
  `-+- policy='round-robin 0' prio=0 status=enabled
     `- #:#:#:# -   #:# failed faulty running
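
The check in the last reproduction step can be scripted. The sketch below is an assumption-laden helper, not part of nova: maps_with_failed_paths() is a hypothetical name, and the parser assumes map header lines of the form "<wwid> dm-N ..." (no aliases) and path lines containing "failed faulty", as in the output above.

```python
import re

# Header line of a multipath map, e.g. "33000000300000001 dm-7 ,"
MAP_HEADER = re.compile(r"^(\S+)\s+(dm-\d+)\b")


def maps_with_failed_paths(multipath_output):
    """Return the dm-N names of maps that list at least one failed path."""
    failed = []
    current = None
    for raw in multipath_output.splitlines():
        line = raw.strip()
        header = MAP_HEADER.match(line)
        if header:
            current = header.group(2)
        elif current and "failed faulty" in line and current not in failed:
            failed.append(current)
    return failed


# "multipath -ll" output after the detach, copied from this report.
AFTER_DETACH = """\
33000000300000001 dm-7 ,
size=4.0G features='0' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=0 status=enabled
   `- #:#:#:# -   #:# failed faulty running
33000000200000001 dm-6 ,
size=3.0G features='0' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=0 status=enabled
   `- #:#:#:# -   #:# failed faulty running
"""

if __name__ == "__main__":
    print(maps_with_failed_paths(AFTER_DETACH))
```

Run against the output above, both remaining maps (dm-7 and dm-6) are reported, even though only the volume behind dm-5 was detached.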

** Affects: nova
     Importance: Undecided
         Status: New

** Patch added: "Patch to fix removing wrong iSCSI multipath device issue"
   https://bugs.launchpad.net/bugs/1382440/+attachment/4238782/+files/fix-removing-wrong-device-problem-in-iscsi-multipath.patch

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1382440

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1382440/+subscriptions

