openstack-volume team mailing list archive

Netapp: DFM refresh lun list - possible race issue

 

Hi,

I stumbled upon an issue with the DFM LUN list and how it is reflected
in the self.discovered_luns list. Before filing a ticket, I would like
to ask whether anybody has seen this problem.

Background:
When creating a volume from a snapshot, the Netapp driver creates a
new LUN (openstack volume) by cloning an existing LUN (openstack
snapshot).

    def create_volume_from_snapshot(self, volume, snapshot):
        ...
        self._clone_lun(lun.HostId, src_path, dest_path, False)
        self._refresh_dfm_luns(lun.HostId)
        self._discover_dataset_luns(dataset, clone_name)

1) _clone_lun - creates the new LUN on the filer
2) _refresh_dfm_luns - asks DFM to refresh its LUN list by querying
filer 'HostId'. This call blocks until the DFM refresh finishes.
3) _discover_dataset_luns - reads the list of LUNs from DFM and updates
the internal self.discovered_luns

Problem:
After "_refresh_dfm_luns" finishes, DFM still reports the LUN list
_without_ the LUN that was just created. This happens sporadically, in
my case in about 10-15% of attempts. When the new LUN is missing from
"self.discovered_luns", the subsequent "create_export" bombs out with
"Error: No entry in LUN table for volume ..".

Notes:
I have also tested this by creating LUNs manually and running code
similar to _refresh_dfm_luns / _discover_dataset_luns, with the same
results.

The driver's code looks correct to me. It seems that it is DFM that
cannot guarantee that its LUN list is up-to-date. I suspect that the
explicit refresh jobs (from the driver) may be interfering with DFM's
internal (cron-scheduled) refresh jobs.

Workaround:
I'm thinking about wrapping steps 2) and 3) in a loop that tests
whether the cloned LUN is in the discovered_luns list or not. Even if
the first refresh/discover returns out-of-date data, the second seems
to be fine.
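
A minimal sketch of such a loop, not tested against the real driver.
The helper refresh_until_discovered, its attempts/delay parameters and
the FakeDfm stand-in are all hypothetical names made up for this
example; in the driver, the two callables would wrap _refresh_dfm_luns
and _discover_dataset_luns.

```python
import time


def refresh_until_discovered(refresh, discover, is_discovered,
                             attempts=3, delay=0.0):
    """Repeat steps 2) and 3) until is_discovered() returns True.

    Hypothetical helper: returns the number of attempts used and raises
    RuntimeError if the LUN never shows up.
    """
    for attempt in range(1, attempts + 1):
        refresh()    # step 2, e.g. self._refresh_dfm_luns(lun.HostId)
        discover()   # step 3, e.g. self._discover_dataset_luns(...)
        if is_discovered():
            return attempt
        time.sleep(delay)  # pause before asking DFM again
    raise RuntimeError('LUN still missing after %d refresh attempts'
                       % attempts)


class FakeDfm(object):
    """Stand-in for DFM that reports a stale LUN list at first."""

    def __init__(self, stale_refreshes=1):
        self.stale_refreshes = stale_refreshes
        self.reported = []          # what "DFM" currently reports
        self.discovered_luns = []   # what the "driver" has read so far

    def refresh(self):
        if self.stale_refreshes > 0:
            self.stale_refreshes -= 1      # simulate out-of-date data
        else:
            self.reported = ['clone_lun']  # refresh finally sees the LUN

    def discover(self):
        self.discovered_luns = list(self.reported)


dfm = FakeDfm(stale_refreshes=1)
attempts_used = refresh_until_discovered(
    dfm.refresh, dfm.discover,
    lambda: 'clone_lun' in dfm.discovered_luns)
print(attempts_used)
```

Capping the number of attempts keeps create_volume_from_snapshot from
hanging forever if DFM never picks up the new LUN; the right values for
attempts and delay are a guess and would need tuning.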

Has anybody seen this? Or does anyone have a better idea how to work
around it?

Regards,

Brano Zarnovican

PS: The Netapp driver (7-mode) is the latest from the Folsom branch,
DFM version 5.1, filer OnTAP 7.3.6P5.