openstack team mailing list archive
Message #22097
Libvirt iSCSI client: duplicate connection_info data
Hi devs,
we are using a backend iSCSI provider (Netapp) that maps
OpenStack volumes to iSCSI LUNs. This mapping is not static and
changes over time. For example, when a volume is detached, its
LUN ID becomes unused. After a while a _different_ volume may get the
same LUN ID, because Netapp recycles them. This is expected behavior.
As a result, there may be entries in "block_device_mapping" with
identical connection_info:
connection_info: {"driver_volume_type": "iscsi", "data":
{"target_lun": "5", .. "target_iqn":
"iqn.1992-08.com.netapp:node.netapp02", "volume_id": 1806}}
connection_info: {"driver_volume_type": "iscsi", "data":
{"target_lun": "5", .. "target_iqn":
"iqn.1992-08.com.netapp:node.netapp02", "volume_id": 2227}}
At most one of them is attached at any time; the rest are in the detached state.
As a fix for #1112483, I delete the device when it is being
disconnected (echo 1 > /sys/block/sdg/device/delete).
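For illustration, the delete step could be sketched in Python roughly as follows. This is only a sketch, not the actual patch: the function name delete_scsi_device and the sysfs_root parameter are hypothetical (the latter exists only so the code can be tried against a fake directory tree), and writing to the real /sys requires root.

```python
import os


def delete_scsi_device(by_path, sysfs_root="/sys"):
    """Remove the SCSI device behind a /dev/disk/by-path symlink.

    Returns False if the symlink is already gone, True after writing
    "1" to <sysfs_root>/block/<dev>/device/delete.
    """
    if not os.path.exists(by_path):
        return False  # nothing to delete, treat as already disconnected

    # e.g. ".../by-path/...-lun-5 -> ../../sdg" resolves to "sdg"
    dev_name = os.path.basename(os.path.realpath(by_path))

    delete_file = os.path.join(sysfs_root, "block", dev_name, "device", "delete")
    with open(delete_file, "w") as f:
        f.write("1")  # same effect as: echo 1 > /sys/block/sdg/device/delete
    return True
```

Note that the function only sees the device path. It deletes whatever device currently sits behind that path, regardless of which volume the caller had in mind.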
The trouble is that OpenStack seems to expect disconnect_volume to be
idempotent (see the _cleanup() method). That is, calling disconnect_volume on
an already detached volume should do nothing. However, because of the LUN reuse,
the ID may by now be mapped to a different volume. The caller is asking me to
disconnect the volume with LUN 5, but from just looking at the device name
there is no way of telling which OpenStack volume it is.
/dev/disk/by-path/ip-172.30.128.3:3260-iscsi-iqn.1992-08.com.netapp:node.netapp02-lun-5
-> ../../sdg
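To make the collision concrete, here is a minimal sketch that builds the by-path name from the connection data. The format string is copied from the example path above; the helper name device_for and the PORTAL constant are assumptions for illustration, since the portal field is elided in the connection_info samples.

```python
# Portal taken from the example path above (not shown in connection_info).
PORTAL = "172.30.128.3:3260"

vol_1806 = {"target_lun": "5",
            "target_iqn": "iqn.1992-08.com.netapp:node.netapp02",
            "volume_id": 1806}
vol_2227 = {"target_lun": "5",
            "target_iqn": "iqn.1992-08.com.netapp:node.netapp02",
            "volume_id": 2227}


def device_for(data, portal=PORTAL):
    """Build the by-path device name; note that volume_id plays no part."""
    return "/dev/disk/by-path/ip-%s-iscsi-%s-lun-%s" % (
        portal, data["target_iqn"], data["target_lun"])


# Both volumes resolve to the very same device node.
same_device = device_for(vol_1806) == device_for(vol_2227)
print(same_device)
```

Since volume_id never enters the path, a disconnect request for the detached volume 1806 is indistinguishable from one for the attached volume 2227.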
How to get out of this?
1) Do not call disconnect_volume for volumes that were already
successfully disconnected. In other words, disconnect_volume is no
longer idempotent.
2) Wipe out connection_info after disconnect. At least for the Netapp
provider it makes no sense to retain info that is no longer valid
anyway.
3) Do not reuse LUN IDs. This would require a major driver change to
keep track of all currently used LUNs for both attached and detached
volumes.
4) Store "somewhere" on the host system a mapping between LUNs and
OpenStack volumes. You could check against it before disconnecting a
LUN device.
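Option 4 could look roughly like the sketch below. Everything here is hypothetical: the LunRegistry class, its JSON file, and the method names are not part of Nova or the Netapp driver. It does no locking, so a real implementation would need to serialize access between concurrent attach and detach calls.

```python
import json
import os


class LunRegistry(object):
    """Hypothetical host-side record of which volume owns each LUN."""

    def __init__(self, path):
        self.path = path  # JSON file kept "somewhere" on the host

    def _load(self):
        if not os.path.exists(self.path):
            return {}
        with open(self.path) as f:
            return json.load(f)

    def _save(self, mapping):
        tmp = self.path + ".tmp"
        with open(tmp, "w") as f:
            json.dump(mapping, f)
        os.replace(tmp, self.path)  # swap the file in one step

    @staticmethod
    def _key(target_iqn, target_lun):
        return "%s/lun-%s" % (target_iqn, target_lun)

    def attach(self, target_iqn, target_lun, volume_id):
        """Record that volume_id now sits behind this LUN."""
        mapping = self._load()
        mapping[self._key(target_iqn, target_lun)] = volume_id
        self._save(mapping)

    def detach(self, target_iqn, target_lun, volume_id):
        """Return True only if volume_id still owns the LUN.

        The caller deletes the device only on True, so a repeated or
        stale disconnect for a recycled LUN becomes a no-op.
        """
        mapping = self._load()
        key = self._key(target_iqn, target_lun)
        if mapping.get(key) != volume_id:
            return False
        del mapping[key]
        self._save(mapping)
        return True
```

With such a check in place, disconnect_volume would stay idempotent: a second disconnect for volume 1806 returns False even after LUN 5 has been handed to volume 2227.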
None of the options is very pleasant. Any suggestions on how to address
the problem?
Regards,
Brano Zarnovican
PS: We are using Essex. LUN reuse is a feature of Netapp that exists
in all versions of the driver (IMO). At a quick glance, I think the
same problem with disconnect_volume exists on Folsom and the master
branch.