yahoo-eng-team team mailing list archive

[Bug 1282822] [NEW] XenAPI: Race condition in wait for coalesce

 

Public bug reported:

_wait_for_vhd_coalesce scans the SR and then checks whether the GC has
finished. The GC might finish between the two calls, in which case the
scan has returned the pre-GC state of the system while the GC check
reports that the GC is not running.

This race can raise an error even though the state of the system is by
then correct.

The order of the scan and the GC check must be switched.
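
To make the ordering problem concrete, here is a minimal sketch using a toy
model rather than the actual nova code. FakeSR, scan_is_coalesced,
gc_is_running and the two check functions are hypothetical names invented for
this illustration; the toy GC is assumed to finish exactly between the first
and second call made against it.

```python
class FakeSR:
    """Toy storage repository (hypothetical, not the XenAPI interface).

    The simulated GC finishes right after the first call returns, which is
    the window in which the race described above occurs.
    """

    def __init__(self):
        self._calls = 0

    def _gc_done(self):
        # Each call advances the toy clock; the GC is done from call 2 on.
        self._calls += 1
        return self._calls > 1

    def scan_is_coalesced(self):
        # The VHD chain is coalesced once the GC has finished.
        return self._gc_done()

    def gc_is_running(self):
        return not self._gc_done()


def decide(coalesced, gc_running):
    """Map one poll result to an action."""
    if coalesced:
        return "coalesced"
    if gc_running:
        return "retry"
    # Not coalesced and GC not running: the caller gives up with an error.
    return "error"


def scan_then_gc_check(sr):
    # Racy order: the scan result may be stale by the time the GC is checked.
    coalesced = sr.scan_is_coalesced()
    gc_running = sr.gc_is_running()
    return decide(coalesced, gc_running)


def gc_check_then_scan(sr):
    # Switched order: if the GC was already idle when checked, the later
    # scan reflects the post-GC state.
    gc_running = sr.gc_is_running()
    coalesced = sr.scan_is_coalesced()
    return decide(coalesced, gc_running)


racy_result = scan_then_gc_check(FakeSR())
fixed_result = gc_check_then_scan(FakeSR())
print(racy_result, fixed_result)
```

In this toy model the racy order gives up although the coalesce has completed,
while the switched order observes the coalesced state. With the switched
order, a GC that finishes between the two calls can at worst cause one extra
retry, not a spurious error.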

2014-02-20 20:26:55.336 TRACE oslo.messaging.rpc.dispatcher   File "/opt/stack/nova/nova/compute/manager.py", line 2455, in backup_instance
2014-02-20 20:26:55.336 TRACE oslo.messaging.rpc.dispatcher     task_states.IMAGE_BACKUP)
2014-02-20 20:26:55.336 TRACE oslo.messaging.rpc.dispatcher   File "/opt/stack/nova/nova/compute/manager.py", line 2521, in _snapshot_instance
2014-02-20 20:26:55.336 TRACE oslo.messaging.rpc.dispatcher     update_task_state)
2014-02-20 20:26:55.336 TRACE oslo.messaging.rpc.dispatcher   File "/opt/stack/nova/nova/virt/xenapi/driver.py", line 261, in snapshot
2014-02-20 20:26:55.336 TRACE oslo.messaging.rpc.dispatcher     self._vmops.snapshot(context, instance, image_id, update_task_state)
2014-02-20 20:26:55.336 TRACE oslo.messaging.rpc.dispatcher   File "/opt/stack/nova/nova/virt/xenapi/vmops.py", line 750, in snapshot
2014-02-20 20:26:55.336 TRACE oslo.messaging.rpc.dispatcher     post_snapshot_callback=update_task_state) as vdi_uuids:
2014-02-20 20:26:55.336 TRACE oslo.messaging.rpc.dispatcher   File "/usr/lib/python2.7/contextlib.py", line 17, in __enter__
2014-02-20 20:26:55.336 TRACE oslo.messaging.rpc.dispatcher     return self.gen.next()
2014-02-20 20:26:55.336 TRACE oslo.messaging.rpc.dispatcher   File "/opt/stack/nova/nova/virt/xenapi/vm_utils.py", line 790, in _snapshot_attached_here_impl
2014-02-20 20:26:55.336 TRACE oslo.messaging.rpc.dispatcher     original_parent_uuid)
2014-02-20 20:26:55.336 TRACE oslo.messaging.rpc.dispatcher   File "/opt/stack/nova/nova/virt/xenapi/vm_utils.py", line 2114, in _wait_for_vhd_coalesce
2014-02-20 20:26:55.336 TRACE oslo.messaging.rpc.dispatcher     raise exception.NovaException(msg)
2014-02-20 20:26:55.336 TRACE oslo.messaging.rpc.dispatcher NovaException: VHD coalesce: Garbage collection not running, giving up...

** Affects: nova
     Importance: Critical
     Assignee: Bob Ball (bob-ball)
         Status: New


** Tags: xenserver

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1282822

Title:
  XenAPI: Race condition in wait for coalesce

Status in OpenStack Compute (Nova):
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1282822/+subscriptions
