
openstack team mailing list archive

Re: Libvirt iSCSI client: duplicit connection_info data

 

On Mar 20, 2013, at 3:39 AM, Brano Zarnovican <zarnovican@xxxxxxxxx> wrote:

> Hi devs,
> 
> We are using a backend iSCSI provider (NetApp) that maps
> OpenStack volumes to iSCSI LUNs. This mapping is not static and
> changes over time. For example, when a volume is detached, its
> LUN id becomes unused. After a while, a _different_ volume may get the
> same LUN id, as NetApp recycles them. This is expected behavior.
> 
> As a result, there may be entries in "block_device_mapping" with
> identical connection_info:
> connection_info: {"driver_volume_type": "iscsi", "data":
> {"target_lun": "5", .. "target_iqn":
> "iqn.1992-08.com.netapp:node.netapp02", "volume_id": 1806}}
> connection_info: {"driver_volume_type": "iscsi", "data":
> {"target_lun": "5", .. "target_iqn":
> "iqn.1992-08.com.netapp:node.netapp02", "volume_id": 2227}}
> Zero or one of them may be attached; the rest are in the detached state.
> 
> As a fix to address #1112483, I'm deleting the device when it is being
> disconnected (echo 1 > /sys/block/sdg/device/delete).
> 
> The trouble is that OpenStack seems to expect disconnect_volume to be
> idempotent (see the _cleanup() method). That is, calling
> disconnect_volume on a detached volume should do nothing. However,
> because of the LUN reuse, the id may now be mapped to a different
> volume. The caller is asking me to disconnect the volume with LUN 5.
> From just looking at the device name, there is no way of telling which
> OpenStack volume it is.
> 
> /dev/disk/by-path/ip-172.30.128.3:3260-iscsi-iqn.1992-08.com.netapp:node.netapp02-lun-5
> -> ../../sdg
> 
> How do we get out of this?
> 
> 1) Do not call 'disconnect_volume' for volumes that were successfully
> disconnected before. In other words, disconnect_volume is not
> idempotent anymore.

I'd really like to keep this idempotent to deal with double delete
races.
> 
> 2) Wipe out connection_info after disconnect. At least for the NetApp
> provider, it makes no sense to retain info that is no longer valid
> anyway.

This seems reasonable. In fact, the whole block_device_mapping item
can be deleted after disconnect. I need a little more context to
understand if this will actually help the issue that you are seeing
though. The double disconnects are usually very close together, so
there shouldn't be a new lun assigned in between two of them anyway.
Have you identified a case where a second disconnect is called much
later?
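
To make the idea concrete, here is a minimal sketch of how wiping
connection_info could keep the disconnect idempotent. It is not Nova
code: FakeVolumeDriver, remove_device and the dict-based mapping entry
are hypothetical stand-ins used only for illustration.

```python
# Hypothetical stand-in for the libvirt volume driver; not the Nova API.
class FakeVolumeDriver:
    def __init__(self):
        self.deleted_devices = []

    def remove_device(self, connection_info):
        # A real driver would log out of the target and delete the device.
        data = connection_info["data"]
        self.deleted_devices.append((data["target_iqn"], data["target_lun"]))


def disconnect_volume(bdm, driver):
    """Disconnect once; later calls are no-ops because the info is gone.

    `bdm` is a plain dict standing in for a block_device_mapping entry.
    Returns True if a device was removed, False if nothing was done.
    """
    connection_info = bdm.get("connection_info")
    if not connection_info:
        return False  # already disconnected, so a recycled LUN is untouched
    driver.remove_device(connection_info)
    bdm["connection_info"] = None  # wipe the now-stale LUN data
    return True
```

With this shape, a second disconnect for the same entry never reaches
the device, even if the LUN id has been handed to another volume in the
meantime.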

> 
> 3) Do not reuse LUN ids. This would require a major driver change to
> keep track of all currently used LUNs for both attached and detached
> volumes.
> 
> 4) Store "somewhere" on the host system a mapping between LUNs and
> OpenStack volumes. You could check against it before disconnecting a
> LUN device.
> 
> None of the options is too pleasant. Any suggestions on how to address
> the problem?
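
For illustration, option 4 could look roughly like the sketch below. It
assumes a small JSON file on the host keyed by target IQN and LUN; the
file location, the function names and the key format are all
hypothetical, and there is no locking, so concurrent attach/detach
calls would need extra care.

```python
import json
import os


def _key(target_iqn, target_lun):
    # Hypothetical key format, loosely modelled on the by-path device name.
    return "%s-lun-%s" % (target_iqn, target_lun)


def _load(path):
    if not os.path.exists(path):
        return {}
    with open(path) as f:
        return json.load(f)


def _save(path, mapping):
    with open(path, "w") as f:
        json.dump(mapping, f)


def record_attach(path, target_iqn, target_lun, volume_id):
    """Remember which volume currently owns this LUN on this host."""
    mapping = _load(path)
    mapping[_key(target_iqn, target_lun)] = volume_id
    _save(path, mapping)


def owns_lun(path, target_iqn, target_lun, volume_id):
    """True only if the LUN is still recorded as belonging to volume_id."""
    return _load(path).get(_key(target_iqn, target_lun)) == volume_id


def record_detach(path, target_iqn, target_lun, volume_id):
    """Forget the LUN if volume_id owns it; return whether it did."""
    mapping = _load(path)
    key = _key(target_iqn, target_lun)
    if mapping.get(key) != volume_id:
        return False  # LUN was recycled or already detached: do nothing
    del mapping[key]
    _save(path, mapping)
    return True
```

A disconnect would then delete the device only when record_detach
returns True, so a late call for volume 1806 leaves a LUN 5 that now
belongs to volume 2227 alone.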
> 
> Regards,
> 
> Brano Zarnovican
> 
> PS: We are using Essex. LUN reuse is a NetApp feature that exists
> in all versions of the driver (IMO). At a quick glance, I think the
> same problem with disconnect_volume exists on Folsom and the master
> branch.
> 


