yahoo-eng-team team mailing list archive

[Bug 1577642] [NEW] race between disk_available_least and instance operations

 

Public bug reported:

The calculation for LibvirtDriver._get_disk_over_committed_size_total()
loops over all the instances on the hypervisor to try to figure out the
total overcommitted size for all instances.

However, when that routine is called from
ResourceTracker.update_available_resource(), we do not hold
COMPUTE_RESOURCE_SEMAPHORE. This means that instance claims can be
modified concurrently (by instance creation, deletion, resize,
migration, etc.), so the calculated value for
data['disk_available_least'] may not reflect the current state, and
different eventlets may see different values of
data['disk_available_least'].
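
To illustrate the kind of serialization being asked for, here is a minimal
sketch in plain Python. It is not nova code: FakeTracker, its methods, and
COMPUTE_RESOURCE_LOCK are hypothetical stand-ins, and a threading.Lock is
assumed in place of whatever mechanism nova uses for
COMPUTE_RESOURCE_SEMAPHORE. The point is only that the summation and the
claim changes take the same lock, so the loop sees one consistent set of
claims.

```python
import threading

# Hypothetical stand-in for nova's COMPUTE_RESOURCE_SEMAPHORE.
COMPUTE_RESOURCE_LOCK = threading.Lock()


class FakeTracker:
    """Toy resource tracker for illustration; not nova's ResourceTracker."""

    def __init__(self, disk_free_gb):
        self.disk_free_gb = disk_free_gb
        # Maps instance uuid -> over-committed disk size in GB.
        self.over_committed_gb = {}

    def claim(self, uuid, over_committed_gb):
        # Instance operations change the claims while holding the lock.
        with COMPUTE_RESOURCE_LOCK:
            self.over_committed_gb[uuid] = over_committed_gb

    def drop(self, uuid):
        with COMPUTE_RESOURCE_LOCK:
            self.over_committed_gb.pop(uuid, None)

    def disk_available_least(self):
        # Taking the same lock here means the summation cannot interleave
        # with claim() or drop() from another thread.
        with COMPUTE_RESOURCE_LOCK:
            total = sum(self.over_committed_gb.values())
            return self.disk_free_gb - total


tracker = FakeTracker(disk_free_gb=100)
tracker.claim("a", 10)
tracker.claim("b", 25)
result = tracker.disk_available_least()
```

Whether holding the lock across the whole per-instance disk scan is
acceptable in nova is a separate question that this sketch does not address.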

A related bug was reported some time ago
(https://bugs.launchpad.net/nova/+bug/968339), but rather than fixing
the underlying race condition, the fix papered over it by ignoring the
InstanceNotFound exception.
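
The workaround pattern described above looks roughly like the following
sketch. It is an assumption about the general shape, not the actual nova
patch: InstanceNotFound is redefined locally, and
get_disk_over_committed_size() and over_committed_total() are hypothetical
helpers. An instance that disappears between listing and lookup is silently
skipped, which avoids the crash but still leaves the total computed against
a set of instances that changed during the loop.

```python
class InstanceNotFound(Exception):
    """Local stand-in for nova's exception of the same name."""


def get_disk_over_committed_size(uuid, live_instances):
    # Hypothetical per-instance lookup: fails if the instance was deleted
    # after the caller obtained its list of uuids.
    if uuid not in live_instances:
        raise InstanceNotFound(uuid)
    return live_instances[uuid]


def over_committed_total(listed_uuids, live_instances):
    total = 0
    for uuid in listed_uuids:
        try:
            total += get_disk_over_committed_size(uuid, live_instances)
        except InstanceNotFound:
            # The instance vanished mid-loop; skip it rather than fail.
            continue
    return total


listed = ["a", "b", "c"]
live = {"a": 10, "c": 5}  # "b" was deleted after the listing was taken
total = over_committed_total(listed, live)
```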

** Affects: nova
     Importance: Undecided
         Status: New


** Tags: compute race-condition

** Description changed:

  The calculation for LibvirtDriver._get_disk_over_committed_size_total()
  loops over all the instances on the hypervisor to try to figure out the
  total overcommitted size for all instances.
  
  However, at the time that routine is called from
  ResourceTracker.update_available_resource()  we do not hold
  COMPUTE_RESOURCE_SEMAPHORE.  This means that instances can be
  created/destroyed/resized, causing the calculated value for
  data['disk_available_least'] to not actually reflect current reality.
+ 
+ There was a related bug reported some time back
+ (https://bugs.launchpad.net/nova/+bug/968339) but rather than deal with
+ the underlying race condition they just sort of papered over it by
+ ignoring the InstanceNotFound exception.

** Description changed:

  The calculation for LibvirtDriver._get_disk_over_committed_size_total()
  loops over all the instances on the hypervisor to try to figure out the
  total overcommitted size for all instances.
  
  However, at the time that routine is called from
  ResourceTracker.update_available_resource()  we do not hold
- COMPUTE_RESOURCE_SEMAPHORE.  This means that instances can be
- created/destroyed/resized, causing the calculated value for
- data['disk_available_least'] to not actually reflect current reality.
+ COMPUTE_RESOURCE_SEMAPHORE.  This means that instance claims can be
+ modified (due to instance creation/deletion/resize/migration/etc),
+ causing the calculated value for data['disk_available_least'] to not
+ actually reflect current reality.
  
  There was a related bug reported some time back
  (https://bugs.launchpad.net/nova/+bug/968339) but rather than deal with
  the underlying race condition they just sort of papered over it by
  ignoring the InstanceNotFound exception.

** Description changed:

  The calculation for LibvirtDriver._get_disk_over_committed_size_total()
  loops over all the instances on the hypervisor to try to figure out the
  total overcommitted size for all instances.
  
  However, at the time that routine is called from
  ResourceTracker.update_available_resource()  we do not hold
  COMPUTE_RESOURCE_SEMAPHORE.  This means that instance claims can be
  modified (due to instance creation/deletion/resize/migration/etc),
- causing the calculated value for data['disk_available_least'] to not
- actually reflect current reality.
+ potentially causing the calculated value for
+ data['disk_available_least'] to not actually reflect current reality,
+ and potentially allowing different eventlets to have different views of
+ data['disk_available_least'].
  
  There was a related bug reported some time back
  (https://bugs.launchpad.net/nova/+bug/968339) but rather than deal with
  the underlying race condition they just sort of papered over it by
  ignoring the InstanceNotFound exception.

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1577642

Title:
  race between disk_available_least and instance operations

Status in OpenStack Compute (nova):
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1577642/+subscriptions