[Bug 1516758] [NEW] synchronization problem in libvirt's remotefs volume drivers

Public bug reported:

Remotefs volume drivers have to mount a filesystem when connecting a new
volume and eventually unmount it. They do it with code like this:


connect_volume:
    if not is_mounted():
        do_mount()

disconnect_volume:
    try:
        umount()
    except Exception as exc:
        if 'fs is busy' not in str(exc):
            raise


There is a race here: another request can unmount the filesystem between the "if not is_mounted():" check and the "do_mount()" call.
I think there should be some sort of reference counting, so that disconnect_volume does not unmount the filesystem while some instances still use it.
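
One possible shape for the fix is sketched below in Python; MountManager and the do_mount/do_umount callbacks are illustrative names, not Nova's actual API. The point is that the mounted-check, the mount, and the unmount all happen under one lock, with a per-mountpoint reference count deciding who really mounts and unmounts:

import threading

class MountManager(object):
    """Illustrative per-mountpoint reference counting, not Nova's real code."""

    def __init__(self):
        self._lock = threading.Lock()
        self._refcounts = {}  # mountpoint -> number of volumes attached on it

    def connect(self, mountpoint, do_mount):
        # The check and the mount happen under one lock, so a concurrent
        # disconnect cannot unmount between them.
        with self._lock:
            if self._refcounts.get(mountpoint, 0) == 0:
                do_mount()  # first user actually mounts
            self._refcounts[mountpoint] = self._refcounts.get(mountpoint, 0) + 1

    def disconnect(self, mountpoint, do_umount):
        with self._lock:
            self._refcounts[mountpoint] -= 1
            if self._refcounts[mountpoint] == 0:
                del self._refcounts[mountpoint]
                do_umount()  # last user actually unmounts

With something like this, deleting inst2 in the test case below would only drop its reference; the share would stay mounted for the suspended inst1 and the resume would succeed.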


A simple test case:

1. Configure cinder to use the NFS driver (a sample cinder.conf is sketched after this list)

2. Create 2 volumes from an image
cinder create --image <image_id> 4
cinder create --image <image_id> 4

3. Boot 2 instances from these volumes
nova boot inst1 --flavor m1.vz --block-device id=<vol1 id>,source=volume,dest=volume,bootindex=0
nova boot inst2 --flavor m1.vz --block-device id=<vol2 id>,source=volume,dest=volume,bootindex=0

4. Suspend the first instance
nova suspend inst1

5. Delete the second instance
nova delete inst2

6. Resume the first instance
nova resume inst1
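
For step 1, a minimal cinder.conf along these lines should do (the backend name and the shares file path here are assumptions; adjust for your deployment):

[DEFAULT]
enabled_backends = nfs

[nfs]
volume_backend_name = nfs
volume_driver = cinder.volume.drivers.nfs.NfsDriver
nfs_shares_config = /etc/cinder/nfs_shares

where /etc/cinder/nfs_shares lists the NFS export, e.g. a single line like <nfs-server>:/export/cinder.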

The error should appear:

] Setting instance vm_state to ERROR
] Traceback (most recent call last):
]   File "/opt/stack/nova/nova/compute/manager.py", line 6374, in _error_out_instance_on_exception
]     yield
]   File "/opt/stack/nova/nova/compute/manager.py", line 4146, in resume_instance
]     block_device_info)
]   File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 2386, in resume
]     vifs_already_plugged=True)
]   File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 4577, in _create_domain_and_network
]     xml, pause=pause, power_on=power_on)
]   File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 4507, in _create_domain
]     guest.launch(pause=pause)
]   File "/opt/stack/nova/nova/virt/libvirt/guest.py", line 141, in launch
]     self._encoded_xml, errors='ignore')
]   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 197, in __exit__
]     six.reraise(self.type_, self.value, self.tb)
]   File "/opt/stack/nova/nova/virt/libvirt/guest.py", line 136, in launch
]     return self._domain.createWithFlags(flags)
]   File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 183, in doit
]     result = proxy_call(self._autowrap, f, *args, **kwargs)
]   File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 141, in proxy_call
]     rv = execute(f, *args, **kwargs)
]   File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 122, in execute
]     six.reraise(c, e, tb)
]   File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 80, in tworker
]     rv = meth(*args, **kwargs)
]   File "/usr/lib64/python2.7/site-packages/libvirt.py", line 1000, in createWithFlags
]     if ret == -1: raise libvirtError ('virDomainCreateWithFlags() failed', dom=self)
] libvirtError: Cannot access storage file '/opt/stack/data/nova/mnt/9f23aa85a377c87a8ad6b6462e329905/volume-97bfb953-5bc3-4dbe-b267-c9519a3a0282' (as uid:107, gid:107): No such file or directory

** Affects: nova
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1516758

Title:
  synchronization problem in libvirt's remotefs volume drivers

Status in OpenStack Compute (nova):
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1516758/+subscriptions