yahoo-eng-team team mailing list archive

[Bug 1793259] Re: sg_scan returns wrong HLU number if it's greater than 255

 

** Changed in: os-brick
       Status: New => Confirmed

** Also affects: nova
   Importance: Undecided
       Status: New

** Changed in: nova
       Status: New => Confirmed

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1793259

Title:
  sg_scan returns wrong HLU number if it's greater than 255

Status in OpenStack Compute (nova):
  Confirmed
Status in os-brick:
  Confirmed

Bug description:
  The tempest test case
  'test_volumes_extend.VolumesExtendAttachedTest.test_extend_attached_volume'
  will fail if the volume created is assigned an HLU number greater than
  255.

  The cause is that os-brick uses 'sg_scan' to get device information (H:C:T:L):
  ======
  def get_device_info(self, device):
      (out, _err) = self._execute('sg_scan', device, run_as_root=True,
                                  root_helper=self._root_helper)
      dev_info = {'device': device, 'host': None,
                  'channel': None, 'id': None, 'lun': None}
      if out:
          line = out.strip()
          line = line.replace(device + ": ", "")
          info = line.split(" ")

          for item in info:
              if '=' in item:
                  pair = item.split('=')
                  dev_info[pair[0]] = pair[1]
              elif 'scsi' in item:
                  dev_info['host'] = item.replace('scsi', '')

      return dev_info
  ======
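
To make the parsing concrete, here is a minimal standalone sketch of the same logic applied to the first line of sg_scan output shown later in this report. The function name parse_sg_scan_line is hypothetical, and the sketch assumes sg_scan without '-i' prints a line in the same format; it does not call sg_scan itself.

```python
def parse_sg_scan_line(device, out):
    """Parse one line of sg_scan output into host/channel/id/lun strings.

    Hypothetical helper mirroring the os-brick logic quoted above.
    """
    dev_info = {'device': device, 'host': None,
                'channel': None, 'id': None, 'lun': None}
    if out:
        # Drop the leading "<device>: " prefix, then split on spaces.
        line = out.strip().replace(device + ": ", "")
        for item in line.split(" "):
            if '=' in item:
                # "channel=0", "id=0", "lun=181" become dict entries.
                key, value = item.split('=', 1)
                dev_info[key] = value
            elif 'scsi' in item:
                # "scsi6" carries the host number.
                dev_info['host'] = item.replace('scsi', '')
    return dev_info


info = parse_sg_scan_line('/dev/sdm',
                          '/dev/sdm: scsi6 channel=0 id=0 lun=181 [em]\n')
print(info)
```

The parser itself is faithful to its input: whatever 'lun' value sg_scan prints is what ends up in dev_info, so a truncated value upstream is passed through unchanged.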

  sg_scan uses 'ioctl SCSI_IOCTL_GET_IDLUN' to get this device information.
  https://github.com/hreinecke/sg3_utils/blob/master/src/sg_scan_linux.c#L321  
  ======
          res = ioctl(sg_fd, SCSI_IOCTL_GET_IDLUN, &my_idlun);
  ...
          printf("%s: scsi%d channel=%d id=%d lun=%d", file_namep, host_no,
                 (my_idlun.dev_id >> 16) & 0xff, my_idlun.dev_id & 0xff,
                 (my_idlun.dev_id >> 8) & 0xff); /* <--- only 8 bits are kept for the LUN */
  ======

  However, the device id (LUN) that sg_scan reports this way is only one
  byte wide, which means it can never be greater than 255.

  Below is an example.
  ======
  Here is a device:
  ------
  # multipath -ll 3600601602220440062449f5b82796e05
  3600601602220440062449f5b82796e05 dm-6 DGC     ,VRAID
  size=1.0G features='1 retain_attached_hw_handler' hwhandler='1 alua' wp=rw
  |-+- policy='round-robin 0' prio=50 status=active
  | `- 6:0:0:15797 sdm 8:192 active ready running
  `-+- policy='round-robin 0' prio=10 status=enabled
    `- 7:0:2:15797 sdn 8:208 active ready running
  ------

  We can see that the device id is 15797.
  This number can also be obtained using 'lsscsi':
  ------
  # lsscsi |grep 15797
  [6:0:0:15797] disk    DGC      VRAID            4400  /dev/sdm
  [7:0:2:15797] disk    DGC      VRAID            4400  /dev/sdn
  ------

  However, sg_scan returns a different device id:
  ------
  # sg_scan -i /dev/sdm
  /dev/sdm: scsi6 channel=0 id=0 lun=181 [em]
      DGC       VRAID             4400 [rmb=0 cmdq=1 pqual=0 pdev=0x0]
  # sg_scan -i /dev/sdn
  /dev/sdn: scsi7 channel=0 id=2 lun=181 [em]
      DGC       VRAID             4400 [rmb=0 cmdq=1 pqual=0 pdev=0x0]
  ------
  The device id returned by sg_scan is 181.

  A simple binary calculation shows that 181 is the low 8 bits of 15797:
  ------
  15797 & 0xff = 0b11110110110101 & 0b11111111 = 0b10110101 = 181
  ------
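
The truncation can be reproduced without any hardware. The sketch below is only an illustration: unpack_idlun mirrors the shifts and masks in the sg_scan printf quoted earlier, while pack_idlun is a hypothetical helper that assumes each field is stored in a single byte of dev_id (target in bits 0-7, LUN in bits 8-15, channel in bits 16-23).

```python
def pack_idlun(channel, target, lun):
    """Pack C:T:L into one integer, one byte per field.

    Assumed layout for illustration; anything above 8 bits is lost here.
    """
    return (target & 0xff) | ((lun & 0xff) << 8) | ((channel & 0xff) << 16)


def unpack_idlun(dev_id):
    """Decode dev_id with the same shifts and masks as the sg_scan printf."""
    return {
        'channel': (dev_id >> 16) & 0xff,
        'id': dev_id & 0xff,
        'lun': (dev_id >> 8) & 0xff,
    }


real_lun = 15797
decoded = unpack_idlun(pack_idlun(channel=0, target=0, lun=real_lun))
print(real_lun & 0xff, decoded['lun'])
```

Under these assumptions, any LUN of 255 or less round-trips intact, and anything larger is reduced to its low byte.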

  Because sg_scan returns the wrong device id, os-brick ends up trying to
  scan a non-existent device, which causes extend_attached_volume to fail.

  We should use 'lsscsi' to get the device id instead.
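
As a rough sketch of that suggestion, the H:C:T:L tuple can be read from an lsscsi line like the ones shown above. The helper parse_lsscsi_line is hypothetical and is not necessarily how os-brick would implement the fix; it assumes the default lsscsi output format, with the bracketed tuple first and the device node last, and it parses captured text rather than running lsscsi.

```python
def parse_lsscsi_line(line):
    """Extract H:C:T:L and the device node from one lsscsi output line.

    Hypothetical helper; returns None if the line is not in the
    "[H:C:T:L] ... /dev/xxx" form assumed here.
    """
    fields = line.split()
    if len(fields) < 2:
        return None
    hctl = fields[0]
    if not (hctl.startswith('[') and hctl.endswith(']')):
        return None
    parts = hctl[1:-1].split(':')
    if len(parts) != 4:
        return None
    host, channel, target, lun = parts
    # Values are kept as strings, like the dict built by get_device_info.
    return {'device': fields[-1], 'host': host,
            'channel': channel, 'id': target, 'lun': lun}


sample = '[6:0:0:15797] disk    DGC      VRAID            4400  /dev/sdm'
print(parse_lsscsi_line(sample))
```

Because the LUN is taken from the printed text rather than from a packed integer, the value for /dev/sdm comes out as 15797 and not 181.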

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1793259/+subscriptions