[Bug 1461827] [NEW] Fail to attach volumes using FC multipath

 

Public bug reported:

* Description

Under an FC-SAN multipath configuration, VM instances sometimes fail to
get their volumes attached via multipath as expected.

Because of this issue:
 - A single hardware failure in the FC fabric can be exposed to VM instances, regardless of the physical multipath configuration
 - Performance can also be affected if an active-active load-balancing policy is configured


* Version

I found this issue while working on a stable/juno based OpenStack
distribution, but I think master still has the same problem.


* How to reproduce

  Set up a Nova/Cinder deployment using Linux/KVM with a multipath FC
fabric.

  As I describe below, this problem happens when:

   1) multipathd is down when nova-compute tries to find the multipath device
 or
   2) multipathd takes a long time (for example, a couple of minutes) to
      configure the multipath devices. I think this can happen for
      various reasons.
* Expected results

   On the compute node hosting the VM in question, 'virsh dumpxml
DOMAIN_ID' shows the source device path(s) of the virtual disk(s)
attached to the VM instance, so you can check whether the disks are
multipath devices or not.

   In an FC-SAN multipath environment, they are expected to be
'/dev/mapper/XXXXX' devices. For example:

| root@overcloud-ce-novacompute0-novacompute0-ueruxqghm5vm:~# virsh dumpxml 2 | grep dev
|    <boot dev='hd'/>
|  <devices>
|    <disk type='block' device='disk'>
|      <source dev='/dev/mapper/360002ac000000000000000250000ba10'/>
|      <target dev='vda' bus='virtio'/>
|  </devices>
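
  If you want to script this check, here is a minimal sketch using the
libvirt Python bindings and xml.etree; the domain name
'instance-00000002' is only a placeholder for your own domain:

  # Check whether every block disk of a libvirt domain is backed by a
  # device-mapper (multipath) node. Illustration only, not nova code.
  import xml.etree.ElementTree as ET

  import libvirt

  conn = libvirt.open('qemu:///system')
  dom = conn.lookupByName('instance-00000002')  # placeholder domain name
  root = ET.fromstring(dom.XMLDesc(0))

  for disk in root.findall("./devices/disk[@type='block']"):
      source = disk.find('source')
      dev = source.get('dev') if source is not None else None
      if not dev:
          continue
      is_mpath = dev.startswith('/dev/mapper/') or dev.startswith('/dev/dm-')
      print('%s -> %s' % (dev, 'multipath' if is_mpath else 'NOT multipath'))

  conn.close()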


* Actual results

   In the output of 'virsh dumpxml DOMAIN_ID', you sometimes (in my
case, often) see non-multipath device path name(s) like the following:

|    <disk type='block' device='disk'>
|      <driver name='qemu' type='raw' cache='none'/>
|      <source dev='/dev/disk/by-path/pci-0000:05:00.0-fc-0x21210002ac00ba10-lun-0'/>
|      <backingStore/>
|      <target dev='vda' bus='virtio'/>
|      <serial>d4d64f3c-bd43-4bc6-8a58-230d677c188b</serial>
|      <alias name='virtio-disk0'/>
|      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
|    </disk>


* Analysis

  I think this comes from Nova's Fibre Channel volume connection
handling code, the 'connect_volume' method.

  On master:
     https://github.com/openstack/nova/blob/master/nova/virt/libvirt/volume.py#L1301
  On upstream stable/juno:
      https://github.com/openstack/nova/blob/stable/juno/nova/virt/libvirt/volume.py#L1012


 The 'connect_volume' method above is in charge of connecting a LUN to the host Linux side,
 and here is the problem.

  After an FC storage box exports LUNs to a compute node, it takes time until:
    (1) the SCSI devices are discovered by the host Linux of the compute node
   and then
    (2) 'multipathd' detects and configures the multipath devices using device mapper

 'connect_volume' retries and waits for (1) above, but there is no retry
logic for (2).

  Thus, the nova-compute service sometimes fails to recognize the multipath
FC devices and attaches single-path devices to VM instances when (2) takes
too long.


* Resolution / Discussion

  I think we need to add retry logic that detects and waits for the
multipath device files in the 'connect_volume' method of Nova's
LibvirtFibreChannelVolumeDriver class.
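
  As a rough sketch of the kind of retry/wait step I mean (this is not
the actual nova code; the wwid argument, retry count and interval are
placeholders):

  # Sketch only: poll for the device-mapper node that multipathd should
  # create for a given WWID, instead of immediately falling back to a
  # single-path device.
  import os
  import time

  def wait_for_multipath_device(wwid, tries=30, interval=2):
      """Return /dev/mapper/<wwid> once it exists, or None on timeout."""
      mpath = '/dev/mapper/%s' % wwid
      for _ in range(tries):
          if os.path.exists(mpath):
              return mpath
          time.sleep(interval)
      return None

  # Inside something like connect_volume(), after step (1) has found the
  # single-path SCSI device:
  #
  #   mpath = wait_for_multipath_device(wwid)
  #   if mpath:
  #       host_device = mpath
  #   else:
  #       # timeout: fail the attach or fall back to the single path,
  #       # depending on the policy discussed below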


  If detection of the multipath devices fails (i.e. times out), I think there are several options:

  choice 1) Make the attach_volume request fail.
      If so, which HTTP status code should be returned?

  choice 2) Go forward with a single path.
      But, from the viewpoint of a service provider's SLA, this is a degradation.
      I'm wondering whether it would be better to return an HTTP status code other than 202 in this case.

  Maybe it's better to allow administrators to choose the expected
behavior via nova.conf options.
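
  For example, a hypothetical option (the name and values below are made
up for illustration and do not exist in nova today) could be defined
with oslo.config like this:

  # Illustration only: a made-up option letting operators choose the
  # behavior when the multipath device is not detected within the timeout.
  from oslo_config import cfg

  fc_multipath_opts = [
      cfg.StrOpt('fc_multipath_timeout_action',
                 default='fallback',
                 choices=['fail', 'fallback'],
                 help='Action when an FC multipath device is not detected '
                      'in time: "fail" rejects the attach request, '
                      '"fallback" attaches the single-path device.'),
  ]

  CONF = cfg.CONF
  CONF.register_opts(fc_multipath_opts)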

** Affects: nova
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1461827

Title:
  Fail to attach volumes using FC multipath

Status in OpenStack Compute (Nova):
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1461827/+subscriptions

