Message #32855
[Bug 1454512] [NEW] Device for other volume is deleted unexpected during volume detach when iscsi multipath is used
Public bug reported:
We found this issue while testing volume detachment with iSCSI
multipath. When the same iSCSI portal and IQN are shared by multiple
LUNs, devices belonging to another volume may be deleted unexpectedly.
This occurs both in Kilo and in the latest code.
For example, when LUN 23 and LUN 231 come from the same storage system and share the same iSCSI portal and IQN, the devices under /dev/disk/by-path may look like this:
ls /dev/disk/by-path
ip-192.168.3.50:3260-iscsi-<iqna>-lun-23
ip-192.168.3.50:3260-iscsi-<iqna>-lun-231
ip-192.168.3.51:3260-iscsi-<iqnb>-lun-23
ip-192.168.3.51:3260-iscsi-<iqnb>-lun-231
When we detach the volume corresponding to LUN 23 from the host, the
devices for LUN 231 are deleted as well, which may make that volume's
data unavailable.
Why does this happen? After digging into the nova code, we found the cause below:
nova/virt/libvirt/volume.py
770     def _delete_mpath(self, iscsi_properties, multipath_device, ips_iqns):
771         entries = self._get_iscsi_devices()
772         # Loop through ips_iqns to construct all paths
773         iqn_luns = []
774         for ip, iqn in ips_iqns:
775             iqn_lun = '%s-lun-%s' % (iqn,
776                                      iscsi_properties.get('target_lun', 0))
777             iqn_luns.append(iqn_lun)
778         for dev in ['/dev/disk/by-path/%s' % dev for dev in entries]:
779             for iqn_lun in iqn_luns:
780                 if iqn_lun in dev:  # ==> Incorrect: a device for LUN 231 also makes this True.
781                     self._delete_device(dev)
782
783         self._rescan_multipath()
Because line 780 uses a substring test, detaching LUN xx also deletes the devices of other LUNs whose numbers start with xx, such as xxy and xxz. We could use dev.endswith(iqn_lun) to avoid this.
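The difference between the two tests can be seen in a small standalone sketch. It does not use nova; the function name matching_devices and the placeholder IQNs (standing in for the elided <iqna> and <iqnb>) are assumptions made only for illustration. It also assumes by-path entries carry no partition suffix such as -part1, which a plain endswith() check would not match.

```python
def matching_devices(devices, iqn_luns, exact=True):
    """Return the by-path entries that would be selected for deletion.

    exact=False mimics the substring test from the report (iqn_lun in dev).
    exact=True uses the suggested test, dev.endswith(iqn_lun).
    """
    selected = []
    for dev in devices:
        for iqn_lun in iqn_luns:
            hit = dev.endswith(iqn_lun) if exact else (iqn_lun in dev)
            if hit:
                selected.append(dev)
                break  # one match is enough for this device
    return selected


# Hypothetical IQNs replace the <iqna>/<iqnb> placeholders from the report.
base = '/dev/disk/by-path/'
devices = [
    base + 'ip-192.168.3.50:3260-iscsi-iqn.example:a-lun-23',
    base + 'ip-192.168.3.50:3260-iscsi-iqn.example:a-lun-231',
    base + 'ip-192.168.3.51:3260-iscsi-iqn.example:b-lun-23',
    base + 'ip-192.168.3.51:3260-iscsi-iqn.example:b-lun-231',
]

# Detaching LUN 23 on both paths.
iqn_luns = ['iqn.example:a-lun-23', 'iqn.example:b-lun-23']

substring_hits = matching_devices(devices, iqn_luns, exact=False)
suffix_hits = matching_devices(devices, iqn_luns, exact=True)

print('substring test selects:', len(substring_hits))
print('endswith test selects:', len(suffix_hits))
```

With the substring test all four entries are selected, including both LUN 231 paths; with the suffix test only the two LUN 23 paths are selected.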
===================================
stack@openstack-performance:~/tina/nova_iscsi_mp/nova$ git log -1
commit f4504f3575b35ec14390b4b678e441fcf953f47b
Merge: 3f21f60 5fbd852
Author: Jenkins <jenkins@xxxxxxxxxxxxxxxxxxxx>
Date: Tue May 12 22:46:43 2015 +0000
Merge "Remove db layer hard-code permission checks for
network_get_all_by_host"
** Affects: nova
Importance: Undecided
Assignee: Tina Tang (tina-tang)
Status: New
** Changed in: nova
Assignee: (unassigned) => Tina Tang (tina-tang)
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1454512
Title:
Device for other volume is deleted unexpected during volume detach
when iscsi multipath is used
Status in OpenStack Compute (Nova):
New
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1454512/+subscriptions