yahoo-eng-team team mailing list archive

[Bug 1253599] Re: Host manager uses a different value for free disk than compute manager

 

** Changed in: nova
       Status: Fix Committed => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1253599

Title:
  Host manager uses a different value for free disk than compute manager

Status in OpenStack Compute (Nova):
  Fix Released

Bug description:
  There are two metrics in the system which describe how much disk space is
  available on a compute host (both stored in compute_nodes):

  free_disk_gb is calculated from the maximum available space in the
  filesystem minus the amount of disk space defined by the instance type of
  each instance on the host.

  disk_available_least is calculated from the actual free space in the
  filesystem minus the disk space that is committed but not yet used by
  all instances that the hypervisor knows about (so if an instance has a
  10GB disk and is currently using 2GB, an additional 8GB will be taken
  away from the actual free space).
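
  As a rough illustration of the two definitions above, the sketch below
  computes both values for a made-up host. The Instance tuple, the function
  names and the numbers are hypothetical and are not taken from Nova; sizes
  are kept in GB for simplicity.

```python
from collections import namedtuple

# Hypothetical record: flavor disk size and space actually used, both in GB.
Instance = namedtuple("Instance", ["flavor_disk_gb", "used_gb"])


def free_disk_gb(total_gb, db_instances):
    """Filesystem size minus the flavor disk size of every instance in the DB."""
    return total_gb - sum(i.flavor_disk_gb for i in db_instances)


def disk_available_least(fs_free_gb, hypervisor_instances):
    """Actual free space minus space that is committed but not yet used."""
    committed_unused = sum(
        i.flavor_disk_gb - i.used_gb for i in hypervisor_instances
    )
    return fs_free_gb - committed_unused


# 100 GB filesystem, one instance with a 10 GB disk using 2 GB,
# plus 5 GB consumed by other files: 93 GB is actually free.
instances = [Instance(flavor_disk_gb=10, used_gb=2)]
print(free_disk_gb(100, instances))         # 90
print(disk_available_least(93, instances))  # 85
```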

  Under normal conditions disk_available_least should therefore always
  be less than free_disk_gb (since it takes into account space in the
  filesystem that is consumed by things other than instance disks).

  However, where an instance exists in the DB but not on the host, which
  can happen for some Error conditions, free_disk_gb may be less than
  disk_available_least (since the instance which is only in the DB is
  not factored into disk_available_least).

  Currently the scheduler (host manager) builds its view of the amount
  of free disk space from disk_available_least (if defined), using
  free_disk_gb only as a fallback if disk_available_least is None.

  https://github.com/openstack/nova/blob/master/nova/scheduler/host_manager.py#L158

  The compute manager resource tracker, on the other hand, always uses
  free_disk_gb when deciding if an instance fits or not.

  https://github.com/openstack/nova/blob/master/nova/compute/resource_tracker.py#L387

  In the case where disk_available_least > free_disk_gb this leads to
  the scheduler sending requests to hosts which will then be rejected.
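
  The mismatch can be reproduced with a small simulation. This is only a
  sketch of the behaviour described above, not the code at the linked
  lines: scheduler_free_gb and tracker_accepts are hypothetical helpers,
  and the host and instance sizes are invented.

```python
def scheduler_free_gb(free_disk_gb, disk_available_least):
    # Scheduler view: prefer disk_available_least, fall back to free_disk_gb.
    if disk_available_least is not None:
        return disk_available_least
    return free_disk_gb


def tracker_accepts(requested_gb, free_disk_gb):
    # Resource tracker view: only free_disk_gb is consulted.
    return requested_gb <= free_disk_gb


# Made-up host with a 100 GB filesystem.
# Instance A: 10 GB flavor, 2 GB used, present in the DB and on the host.
# Instance B: 50 GB flavor, present in the DB only.
free_disk_gb = 100 - (10 + 50)               # 40: both DB instances count
disk_available_least = (100 - 2) - (10 - 2)  # 90: only instance A is seen

requested_gb = 60
scheduler_passes = requested_gb <= scheduler_free_gb(
    free_disk_gb, disk_available_least
)
claim_succeeds = tracker_accepts(requested_gb, free_disk_gb)
print(scheduler_passes, claim_succeeds)  # True False
```

  The scheduler considers the host suitable for a 60 GB request, but the
  claim on the compute host fails because only 40 GB is free by its metric.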

  Clearly using two different metrics in this way is not healthy.

  At a minimum the scheduler should use the minimum of the two values
  (since the "missing" VM may come back, it's not safe to just ignore it).

  It would probably be better if the compute manager also did the same
  thing.
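
  A minimal sketch of that proposal, assuming both values are expressed in
  GB and that disk_available_least may be None when the hypervisor does
  not report it. usable_disk_gb is a hypothetical helper name, and this is
  not necessarily how the committed fix is written.

```python
def usable_disk_gb(free_disk_gb, disk_available_least):
    """Return the more conservative of the two free-disk metrics."""
    if disk_available_least is None:
        # The hypervisor did not report a value; fall back to free_disk_gb.
        return free_disk_gb
    return min(free_disk_gb, disk_available_least)


# Normal case: disk_available_least is the smaller value.
print(usable_disk_gb(90, 85))    # 85
# Instance only in the DB: free_disk_gb is the smaller value.
print(usable_disk_gb(40, 90))    # 40
# No value reported by the hypervisor.
print(usable_disk_gb(40, None))  # 40
```

  If the scheduler and the resource tracker both called the same helper,
  they could no longer disagree about whether an instance fits.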

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1253599/+subscriptions