
yahoo-eng-team team mailing list archive

[Bug 1230047] Re: VMware: spawning large amounts of VMs concurrently sometimes causes "VMDK lock" error

 

** Changed in: nova
       Status: Fix Committed => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1230047

Title:
  VMware: spawning large amounts of VMs concurrently sometimes causes
  "VMDK lock" error

Status in OpenStack Compute (Nova):
  Fix Released
Status in The OpenStack VMwareAPI subTeam:
  Confirmed

Bug description:
  When using the VMwareVCDriver, spawning a large number of virtual
  machines concurrently causes some instances to end up with status
  ERROR. The number of instances that fail to build is unpredictable,
  and sometimes all instances spawn successfully.

  The issue can be reproduced by running:

      nova boot --image debian-2.6.32-i686 --flavor 1 --num-instances 32 nameless

  The number of instances needed to trigger the errors differs from
  environment to environment; start with 30-40. Two errors in the logs
  account for the instance spawn failures. The first is the ESX host
  not finding the image in the NFS datastore (even though it is there;
  otherwise no other instances could be spawned). The second is the ESX
  host not being able to access the VMDK image because it is locked.

  Image not found error:

  Traceback (most recent call last):
    File "/opt/stack/nova/nova/compute/manager.py", line 1408, in _spawn
      block_device_info)
    File "/opt/stack/nova/nova/virt/vmwareapi/driver.py", line 609, in spawn
      admin_password, network_info, block_device_info)
    File "/opt/stack/nova/nova/virt/vmwareapi/vmops.py", line 440, in spawn
      vmdk_file_size_in_kb, linked_clone)
    File "/opt/stack/nova/nova/virt/vmwareapi/volumeops.py", line 71, in attach_disk_to_vm
      self._session._wait_for_task(instance_uuid, reconfig_task)
    File "/opt/stack/nova/nova/virt/vmwareapi/driver.py", line 795, in _wait_for_task
      ret_val = done.wait()
    File "/usr/local/lib/python2.7/dist-packages/eventlet/event.py", line 116, in wait
      return hubs.get_hub().switch()
    File "/usr/local/lib/python2.7/dist-packages/eventlet/hubs/hub.py", line 187, in switch
      return self.greenlet.switch()
  NovaException: File [ryan-nfs] vmware_base/e8c42ed8-05e7-45bc-90c3-49a34e5a37c6.vmdk was not found

  Image locked error:

  Traceback (most recent call last):
    File "/opt/stack/nova/nova/compute/manager.py", line 1407, in _spawn
      block_device_info)
    File "/opt/stack/nova/nova/virt/vmwareapi/driver.py", line 623, in spawn
      admin_password, network_info, block_device_info)
    File "/opt/stack/nova/nova/virt/vmwareapi/vmops.py", line 504, in spawn
      root_gb_in_kb, linked_clone)
    File "/opt/stack/nova/nova/virt/vmwareapi/volumeops.py", line 71, in attach_disk_to_vm
      self._session._wait_for_task(instance_uuid, reconfig_task)
    File "/opt/stack/nova/nova/virt/vmwareapi/driver.py", line 900, in _wait_for_task
      ret_val = done.wait()
    File "/usr/local/lib/python2.7/dist-packages/eventlet/event.py", line 116, in wait
      return hubs.get_hub().switch()
    File "/usr/local/lib/python2.7/dist-packages/eventlet/hubs/hub.py", line 187, in switch
      return self.greenlet.switch()
  NovaException: Unable to access file [ryan-nfs] vmware_base/f110bb94-2170-4a3a-ae0d-760f95eb8b47.0.
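
The two failure modes can be told apart by the final NovaException line of
each traceback. The following is a minimal sketch for tallying them from a
log, assuming the messages appear in exactly the form quoted above. The
names `PATTERNS`, `classify` and `tally` are hypothetical helpers for
illustration and are not part of nova.

```python
import re
from collections import Counter

# Patterns derived from the two NovaException messages quoted in this report.
# Assumption: other environments log the same wording.
PATTERNS = {
    "image_not_found": re.compile(
        r"NovaException: File \[(?P<datastore>[^\]]+)\] (?P<path>\S+) was not found"
    ),
    "image_locked": re.compile(
        r"NovaException: Unable to access file \[(?P<datastore>[^\]]+)\] (?P<path>\S+)"
    ),
}


def classify(line):
    """Return the failure kind for a log line, or None if it matches neither."""
    for kind, pattern in PATTERNS.items():
        if pattern.search(line):
            return kind
    return None


def tally(lines):
    """Count how often each failure kind occurs in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        kind = classify(line)
        if kind is not None:
            counts[kind] += 1
    return counts
```

For example, `tally(open("n-cpu.log"))` would give a per-kind count for one
run, where the log file name is only a placeholder for wherever the compute
log is written.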

  Environment information:

  - 1 datacenter, 1 cluster, 7 hosts
  - NFS shared datastore
  - was able to spawn 7 instances before errors appeared
  - screen log with tracebacks: http://paste.openstack.org/show/47410/

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1230047/+subscriptions