yahoo-eng-team team mailing list archive

[Bug 1112483] Re: LibvirtISCSIVolumeDriver: device size mismatch when LUN is reused

 

** Changed in: nova
       Status: Fix Committed => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1112483

Title:
  LibvirtISCSIVolumeDriver: device size mismatch when LUN is reused

Status in OpenStack Compute (Nova):
  Fix Released

Bug description:
  Short problem summary:
  ====================

  When a LUN id is reused by the SCSI provider, it may cause a device
  size mismatch on the compute node. The host may report to the guest
  the size of the volume previously mapped to this LUN id, not the size
  of the device that is mapped there now. This happens for SCSI
  providers that use one target with many LUNs (e.g. NetApp).

  Detailed problem description:
  ========================

  The OpenStack iSCSI client in disconnect_volume() calls iscsiadm with
  --logout only if nobody else is using LUNs from that target.
  Otherwise, it does nothing and the device stays in place:

  # ls -l /dev/disk/by-path/ip-172.30.128.3\:3260-iscsi-iqn.1992-08.com.netapp\:node.netapp02-lun-0
  lrwxrwxrwx. 1 root root 9 Feb  1 11:06 /dev/disk/by-path/ip-172.30.128.3:3260-iscsi-iqn.1992-08.com.netapp:node.netapp02-lun-0 -> ../../sdg
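
  The logout decision described above can be sketched roughly as
  follows. This is only an illustration, not the actual Nova code:
  parse_by_path(), should_logout() and the list of in-use device names
  are hypothetical, and the pattern is derived from the by-path names
  shown in this report.

```python
import re

# Pattern for udev by-path names of iSCSI devices, as they appear in this report.
BY_PATH_RE = re.compile(r"^ip-(?P<portal>.+?)-iscsi-(?P<iqn>.+)-lun-(?P<lun>\d+)$")


def parse_by_path(name):
    """Split a by-path entry name into (portal, iqn, lun), or return None."""
    m = BY_PATH_RE.match(name)
    if m is None:
        return None
    return m.group("portal"), m.group("iqn"), int(m.group("lun"))


def should_logout(in_use_names, portal, iqn, detached_lun):
    """Return True only if no other LUN of the same target is still in use."""
    for name in in_use_names:
        parsed = parse_by_path(name)
        if parsed is None:
            continue  # not an iSCSI by-path entry
        other_portal, other_iqn, other_lun = parsed
        if (other_portal, other_iqn) == (portal, iqn) and other_lun != detached_lun:
            return False  # another LUN shares the session, so keep it open
    return True
```

  With a single target exporting many LUNs, should_logout() returns
  False as long as any other volume is attached, which is the case
  where the old device is left behind.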

  Later, nova-volume unmaps the LUN from the initiator and this device
  becomes invalid. Example "sanlun" output:

  # sanlun lun show
  controller(7mode)/                                                                               device          host                  lun    
  vserver(Cmode)       lun-pathname                                                                filename        adapter    protocol   size    mode 
  ---------------------------------------------------------------------------------------------------------------------------------------------------
  1081809-413161-N2    <unknown>                                                                   /dev/sdg        host7      iSCSI              7    

  At some point, a different volume needs to be made available to the
  same compute node. The remote SCSI provider may choose to recycle an
  unused LUN id. From the client's point of view, a different OpenStack
  volume is visible under the same target and LUN id as before.
  After nova-volume has completed the LUN mapping, nova-compute's
  connect_volume() is called. Note that, at this point, the iSCSI
  session to the target is up and the device symlink
  (/dev/disk/by-path/..) exists. The OpenStack iSCSI driver calls
  "iscsiadm .. --login" (with no effect). A rescan is not performed,
  because the device already exists. Libvirt and the VM start to use
  the stale device.
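
  One possible mitigation, sketched below under assumptions, is to
  rescan the session even when the symlink already exists, or to remove
  the stale SCSI device at detach time. The helper names are
  hypothetical and this is not presented as the fix that was committed;
  the sketch only builds the iscsiadm command line and the sysfs path
  and does not execute anything. It assumes the iscsiadm --rescan option
  and the /sys/block/<dev>/device/delete attribute are available on the
  compute node.

```python
import os


def rescan_command(iqn, portal):
    """Build an iscsiadm call that rescans the existing session for LUN changes."""
    return ["iscsiadm", "-m", "node", "-T", iqn, "-p", portal, "--rescan"]


def sysfs_delete_path(by_path_link, realpath=os.path.realpath):
    """Return the sysfs file that removes the SCSI device behind a by-path symlink.

    Writing "1" to this file (as root) makes the kernel drop the device.
    The realpath argument is injectable so the function can be tried
    without real devices.
    """
    dev = os.path.basename(realpath(by_path_link))  # e.g. "sdg"
    return "/sys/block/%s/device/delete" % dev
```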

  Access to the re-mapped device produces the following kernel messages:
  Feb  1 11:06:45 prod-cmp10 kernel: sd 7:0:0:1: [sdh] Warning! Received an indication that the LUN assignments on this target have changed. The Linux SCSI layer does not automatically remap LUN assignments.
  Feb  1 11:06:45 prod-cmp10 kernel: sd 7:0:0:0: [sdg] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
  Feb  1 11:06:45 prod-cmp10 kernel: sd 7:0:0:0: [sdg] Sense Key : Illegal Request [current] 
  Feb  1 11:06:45 prod-cmp10 kernel: Info fld=0x0

  For some reason, the kernel reports the warning on the device that did
  NOT change ("sdh" rather than "sdg"). Possibly a bug in the Linux
  iSCSI client?
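
  The SCSI address in these messages has the form
  host:channel:target:lun, so the warning is logged against LUN 1 (sdh)
  while the remapped volume sits on LUN 0 (sdg). A small parser makes
  this explicit; parse_sd_line() is a hypothetical helper, not part of
  any existing tool.

```python
import re

# Matches the "sd H:C:T:L: [sdX]" prefix of the kernel messages quoted above.
SD_LINE_RE = re.compile(r"sd (\d+):(\d+):(\d+):(\d+): \[(\w+)\]")


def parse_sd_line(line):
    """Extract the SCSI address and device name from a kernel 'sd' log line."""
    m = SD_LINE_RE.search(line)
    if m is None:
        return None
    host, channel, target, lun = (int(g) for g in m.groups()[:4])
    return {
        "host": host,
        "channel": channel,
        "target": target,
        "lun": lun,
        "device": m.group(5),
    }
```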

  This issue affects SCSI systems that expose targets with multiple
  LUNs (e.g. NetApp). The OpenStack implementation on the LVM/tgtd
  backend is not affected because it uses multiple targets with a single
  LUN each. When the LUN becomes unused, the driver closes the whole
  session.

  Steps to reproduce:
  ================

  1) create three volumes with different sizes (1, 2, 3 GB)

  # euca-describe-volumes vol-00000551 vol-00000552 vol-00000553
  VOLUME	vol-00000551	 1		na.dev-netapp	available	2013-02-01T09:32:01.000Z
  VOLUME	vol-00000552	 2		na.dev-netapp	available	2013-02-01T09:32:07.000Z
  VOLUME	vol-00000553	 3		na.dev-netapp	available	2013-02-01T09:32:13.000Z

  2) attach the 3G and 2G volumes to an instance

  compute node# virsh domblklist i-000005dc
  Target     Source
  ------------------------------------------------
  ...
  vdc        /dev/disk/by-path/ip-172.30.128.3:3260-iscsi-iqn.1992-08.com.netapp:node.netapp02-lun-0
  vdd        /dev/disk/by-path/ip-172.30.128.3:3260-iscsi-iqn.1992-08.com.netapp:node.netapp02-lun-1

  instance# lsblk
  NAME   MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
  ...
  vdc    252:32   0   3G  0 disk 
  vdd    252:48   0   2G  0 disk 

  3) detach the 3G volume (LUN0 becomes unused)

  The device still exists:

  # ls -l /dev/disk/by-path/
  ...
  lrwxrwxrwx. 1 root root  9 Feb  1 10:46 ip-172.30.128.3:3260-iscsi-iqn.1992-08.com.netapp:node.netapp02-lun-0 -> ../../sdg
  lrwxrwxrwx. 1 root root  9 Feb  1 10:44 ip-172.30.128.3:3260-iscsi-iqn.1992-08.com.netapp:node.netapp02-lun-1 -> ../../sdh

  # sanlun lun show
  controller(7mode)/                                                                               device          host                  lun    
  vserver(Cmode)       lun-pathname                                                                filename        adapter    protocol   size    mode 
  ---------------------------------------------------------------------------------------------------------------------------------------------------
  1081809-413161-N2    /vol/OpenStack_103a49bb861e485ea05aa78f9b0216bd_1/vol-00000552/vol-00000552 /dev/sdh        host7      iSCSI      2g      7    
  1081809-413161-N2    <unknown>                                                                   /dev/sdg        host7      iSCSI              7    

  4) attach the 1G volume to the same instance (LUN0 is reused for a
  different volume)

  Expected result:
  The instance sees a new 1G device attached.

  Actual result:
  The instance reports the size as 3G.

  The host OS also reports 3G. SCSI tools report the correct size (1G).

  instance# lsblk
  NAME   MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
  ...
  vdc    252:32   0   3G  0 disk 
  vdd    252:48   0   2G  0 disk 

  compute# virsh domblklist i-000005dc
  Target     Source
  ------------------------------------------------
  ...
  vdc        /dev/disk/by-path/ip-172.30.128.3:3260-iscsi-iqn.1992-08.com.netapp:node.netapp02-lun-0
  vdd        /dev/disk/by-path/ip-172.30.128.3:3260-iscsi-iqn.1992-08.com.netapp:node.netapp02-lun-1

  compute# ls -l /dev/disk/by-path/ip-172.30.128.3:3260-iscsi-iqn.1992-08.com.netapp:node.netapp02-lun-?
  lrwxrwxrwx. 1 root root 9 Feb  1 10:47 /dev/disk/by-path/ip-172.30.128.3:3260-iscsi-iqn.1992-08.com.netapp:node.netapp02-lun-0 -> ../../sdg
  lrwxrwxrwx. 1 root root 9 Feb  1 10:47 /dev/disk/by-path/ip-172.30.128.3:3260-iscsi-iqn.1992-08.com.netapp:node.netapp02-lun-1 -> ../../sdh

  compute# lsblk
  ...
  sdg                                        8:96   0     3G  0 disk 
  sdh                                        8:112  0     2G  0 disk 

  compute# sanlun lun show
  controller(7mode)/                                                                               device          host                  lun    
  vserver(Cmode)       lun-pathname                                                                filename        adapter    protocol   size    mode 
  ---------------------------------------------------------------------------------------------------------------------------------------------------
  1081809-413161-N2    /vol/OpenStack_103a49bb861e485ea05aa78f9b0216bd_1/vol-00000552/vol-00000552 /dev/sdh        host7      iSCSI      2g      7    
  1081809-413161-N2    /vol/OpenStack_103a49bb861e485ea05aa78f9b0216bd_1/vol-00000551/vol-00000551 /dev/sdg        host7      iSCSI      1g      7    
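
  The mismatch above (the kernel reports 3G for sdg while sanlun reports
  1g) could be detected on the compute node with a check like the one
  below. It is a sketch under assumptions: the function names are
  hypothetical, the expected size is taken to be a whole number of GiB,
  and it relies on /sys/block/<dev>/size being expressed in 512-byte
  sectors.

```python
SECTOR_BYTES = 512  # /sys/block/<dev>/size counts 512-byte sectors
GIB = 1024 ** 3


def read_sectors(dev):
    """Read the size the kernel currently believes the device has, in sectors."""
    with open("/sys/block/%s/size" % dev) as f:
        return int(f.read().strip())


def size_mismatch(sectors, expected_gib):
    """Return True if the kernel's view differs from the expected volume size."""
    return sectors * SECTOR_BYTES != expected_gib * GIB
```

  In the situation shown here, size_mismatch(read_sectors("sdg"), 1)
  would be expected to return True, provided the LUN is exactly 1 GiB.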

  
  I'm also attaching more outputs with preserved formatting (outputs.txt).

  Regards,

  Brano Zarnovican
