yahoo-eng-team team mailing list archive
Message #32886
[Bug 1454978] [NEW] [iSCSI Multipath] Thousands of multipath -ll <mp-id> are executed during volume detachment when multiple LUNs are exposed on the same target
Public bug reported:
iSCSI multipath has a performance issue on volume detachment when multiple
LUNs are exposed via a single target (iqn).
1. We are using VNX as the cinder backend. VNX exposes multiple LUNs
via one iqn, and each LUN is exposed via different iqns for multipathing.
The libvirt driver is used in nova, and the virt_type is kvm.
2. After attaching 100 volumes to VMs and then detaching volumes in
batch, we noticed that thousands of "multipath -ll <mp_id>" commands are
executed per volume detachment. In our environment, one "multipath -ll
<mp_id>" takes about 0.2s, so the performance is bad.
3. Why are so many "multipath -ll <mp-id>" commands triggered?
In order to find all paths of a multipath device, the code goes through all the devices under /dev/disk/by-path that use the same iqn and executes "multipath -ll" on each of them to get the multipath id. When the multipath id of a device is the same as that of the volume to be detached, the device is a path of the volume. When each iqn exposes only one LUN, this code does not show a performance issue. However, when multiple LUNs are exposed via a single iqn, the problem appears.
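As a rough illustration of why shared targets matter here, the sketch below groups by-path entry names by their portal-and-iqn prefix. The entry names, addresses, and the helper `group_by_target` are made up for the example; only the `ip-...-iscsi-...-lun-N` naming pattern is taken from the code quoted in this report.

```python
from collections import defaultdict


def group_by_target(entries):
    """Map each "<portal>-iscsi-<iqn>" prefix to the LUN numbers seen under it."""
    groups = defaultdict(list)
    for entry in entries:
        # Split off the trailing "-lun-N" part of the entry name.
        prefix, _, lun = entry.rpartition("-lun-")
        if prefix.startswith("ip-"):
            prefix = prefix[len("ip-"):]
        groups[prefix].append(int(lun))
    return dict(groups)


# Hypothetical listing of /dev/disk/by-path: 2 targets (m = 2), 3 LUNs (n = 3).
entries = [
    "ip-10.0.0.1:3260-iscsi-iqn.2015-05.com.example:a-lun-%d" % lun
    for lun in range(3)
] + [
    "ip-10.0.0.2:3260-iscsi-iqn.2015-05.com.example:b-lun-%d" % lun
    for lun in range(3)
]

for target, luns in sorted(group_by_target(entries).items()):
    print(target, luns)
```

Every entry in a group is a candidate path that the detach code has to check, so a target exposing many LUNs means many checks per target.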
Assume that we have n LUNs attached and each LUN has m iqns for multipathing. Then there will be m*n devices under /dev/disk/by-path, and they share m iqns. Then,
-- Code lines 623-644 will trigger o(n*m) executions of "multipath -ll <mp-id>"
-- Code lines 648-649 will trigger o(!m) executions of "multipath -ll <mp-id>"
nova/nova/virt/libvirt/volume.py
LibvirtISCSIVolumeDriver._disconnect_volume_multipath_iscsi
618     out = self._run_iscsiadm_discover(iscsi_properties)
619
620     # Extract targets for the current multipath device.
621     ips_iqns = []
622     entries = self._get_iscsi_devices()
623     for ip, iqn in self._get_target_portals_from_iscsiadm_output(out):
624         ip_iqn = "%s-iscsi-%s" % (ip.split(",")[0], iqn)
625         for entry in entries:
626             entry_ip_iqn = entry.split("-lun-")[0]
627             if entry_ip_iqn[:3] == "ip-":
628                 entry_ip_iqn = entry_ip_iqn[3:]
629             elif entry_ip_iqn[:4] == "pci-":
630                 # Look at an offset of len('pci-0000:00:00.0')
631                 offset = entry_ip_iqn.find("ip-", 16, 21)
632                 entry_ip_iqn = entry_ip_iqn[(offset + 3):]
633             if (ip_iqn != entry_ip_iqn):
634                 continue
635             entry_real_path = os.path.realpath("/dev/disk/by-path/%s" %
636                                                entry)
637             entry_mpdev = self._get_multipath_device_name(entry_real_path)
638             if entry_mpdev == multipath_device:
639                 ips_iqns.append([ip, iqn])
640                 break
641
642     if not devices:
643         # disconnect if no other multipath devices
644         self._disconnect_mpath(iscsi_properties, ips_iqns)
645         return
646
647     # Get a target for all other multipath devices
648     other_iqns = [self._get_multipath_iqn(device)
649                   for device in devices]
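To make the count for lines 623-640 concrete, here is a small simulation, offered as a sketch rather than the actual nova code path. It assumes every target exposes every LUN, treats the LUN number as the multipath device identity, and counts one lookup wherever the real code would run "multipath -ll". The names `build_entries` and `count_lookups` and all addresses are hypothetical.

```python
def build_entries(portals, n_luns):
    """Hypothetical by-path names: each (ip, iqn) portal exposes every LUN."""
    return ["ip-%s-iscsi-%s-lun-%d" % (ip, iqn, lun)
            for ip, iqn in portals
            for lun in range(n_luns)]


def count_lookups(portals, entries, target_lun):
    """Count simulated "multipath -ll" calls made by the nested loop."""
    calls = 0
    for ip, iqn in portals:
        ip_iqn = "%s-iscsi-%s" % (ip, iqn)
        for entry in entries:
            prefix, _, lun = entry.rpartition("-lun-")
            if prefix[len("ip-"):] != ip_iqn:
                continue  # entry belongs to another target, no lookup
            calls += 1  # stands in for one "multipath -ll" execution
            if int(lun) == target_lun:
                break  # found the path of the volume being detached
    return calls


# m = 4 targets, n = 100 LUNs (assumed values for illustration).
portals = [("10.0.0.%d:3260" % i, "iqn.2015-05.com.example:t%d" % i)
           for i in range(1, 5)]
entries = build_entries(portals, 100)

worst = count_lookups(portals, entries, target_lun=99)
best = count_lookups(portals, entries, target_lun=0)
print(worst, best)  # 400 4
```

With m = 4 targets and n = 100 LUNs, the worst case under these assumptions is 400 lookups for the first loop alone, or roughly 80 s at the 0.2 s per call reported above.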
====================Code version =====================
stack@openstack-performance:~/tina/nova_iscsi_mp/nova$ git log -1
commit f4504f3575b35ec14390b4b678e441fcf953f47b
Merge: 3f21f60 5fbd852
Author: Jenkins <jenkins@xxxxxxxxxxxxxxxxxxxx>
Date: Tue May 12 22:46:43 2015 +0000
Merge "Remove db layer hard-code permission checks for
network_get_all_by_host"
** Affects: nova
Importance: Undecided
Status: New
** Description changed:
- iSCSI multipath has performance issue on volume detachment when multiple LUNs are exposed via single target(iqn).
- 1. We am using VNX as cinder backends. VNX is exposing multiple LUNs via a iqn. And a LUN is exposed via different iqns for multipathing. Libvirt driver is used in nova. And the virt_type is kvm.
+ iSCSI multipath has performance issue on volume detachment when multiple
+ LUNs are exposed via single target(iqn).
+
+ 1. We are using VNX as cinder backends. VNX is exposing multiple LUNs
+ via a iqn. And a LUN is exposed via different iqns for multipathing.
+ Libvirt driver is used in nova. And the virt_type is kvm.
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1454978
Title:
[iSCSI Multipath]Thousands of multipath -ll <mp-id > are executed
during volume detachment when multiple LUNs are exposed on a same
target
Status in OpenStack Compute (Nova):
New
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1454978/+subscriptions