yahoo-eng-team team mailing list archive
Message #73540
[Bug 1683858] Re: Allocation records do not contain overhead information
Gonna kill this one. We seem to have reached the consensus that
overhead is something an operator may manage however they like; it is
not something we will generically manage.
In the future it might make sense for the virt drivers to handle
overhead via the reserved inventory field when they are working with
update_provider_tree.
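For illustration only, a driver could fold its estimated overhead into
the reserved value of its MEMORY_MB inventory when it builds the
provider tree; this is a sketch of the idea, not an agreed design, and
_estimated_overhead_mb() is a made-up helper:

    # Hypothetical sketch, assuming the ProviderTree data()/update_inventory()
    # interface; _estimated_overhead_mb() is an invented helper.
    def update_provider_tree(self, provider_tree, nodename, allocations=None):
        inventory = provider_tree.data(nodename).inventory
        # Reserve the memory the hypervisor needs for per-instance overhead
        # so placement never hands it out to instances.
        inventory['MEMORY_MB']['reserved'] = self._estimated_overhead_mb()
        provider_tree.update_inventory(nodename, inventory)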
** Changed in: nova
Status: Triaged => Won't Fix
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1683858
Title:
Allocation records do not contain overhead information
Status in OpenStack Compute (nova):
Won't Fix
Bug description:
Some virt drivers report additional overhead per instance for memory
and disk usage on a compute node. However, that overhead is not
reported in the allocation records for a given instance on a resource
provider (compute node):
https://github.com/openstack/nova/blob/15.0.0/nova/scheduler/client/report.py#L157
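For context, the allocation reported there is sized purely from the
flavor; roughly speaking (a simplified sketch, not the exact code
behind the link above):

    # Simplified sketch: allocations sent to placement come from the flavor
    # alone, so no per-instance overhead appears in them.
    def instance_to_allocations(flavor):
        return {
            'MEMORY_MB': flavor.memory_mb,
            'VCPU': flavor.vcpus,
            'DISK_GB': flavor.root_gb + flavor.ephemeral_gb,
        }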
The overhead is used as part of the claim test on the compute node
when creating or moving an instance. For creating an instance, that's
done here:
https://github.com/openstack/nova/blob/15.0.0/nova/compute/resource_tracker.py#L144-L156
https://github.com/openstack/nova/blob/15.0.0/nova/compute/claims.py#L165
Where Claim.memory_mb is the instance.flavor.memory_mb + overhead:
https://github.com/openstack/nova/blob/15.0.0/nova/compute/claims.py#L106
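By contrast, the compute-side claim adds the driver-reported overhead
on top of the flavor. A minimal sketch of the idea
(estimate_instance_overhead() is the virt driver hook, the rest is
simplified):

    # Simplified sketch: what the resource tracker claims locally.
    overhead = driver.estimate_instance_overhead(instance)  # e.g. {'memory_mb': 64, 'disk_gb': 0}
    claimed_memory_mb = instance.flavor.memory_mb + overhead['memory_mb']
    # The allocation sent to placement is just instance.flavor.memory_mb,
    # so the two numbers diverge by the overhead.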
So ultimately what we claim on the compute node is not what we report
to placement in the allocations for that instance. This matters
because when the filter scheduler asks placement for a list of
resource providers that can fit a given request's memory_mb and
disk_gb, it relies on the inventory for the compute node resource
provider and the existing usage (allocations) for that provider, and
we aren't reporting the full story to placement.
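In other words, the scheduler's placement query is sized from the
flavor only, something like this (illustrative only; the exact query
building differs):

    # Illustrative only: the resource filter sent to placement knows nothing
    # about per-instance overhead.
    resources = 'VCPU:%d,MEMORY_MB:%d,DISK_GB:%d' % (
        flavor.vcpus, flavor.memory_mb, flavor.root_gb + flavor.ephemeral_gb)
    # e.g. GET /resource_providers?resources=VCPU:1,MEMORY_MB:2048,DISK_GB:20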
This could lead to placement telling the filter scheduler there is
room to place an instance on a given compute node when in fact the
claim could fail once we get to the host, which would result in a
retry of the build on another host (which can be expensive).
Also, when we start having multi-cell support with a top-level
conductor that the computes can't reach, we won't have build retries
anymore, so a failed claim would simply end the build and put the
instance into ERROR state. So it's critical that the placement
service has the proper information for making the correct decision on
the first try.
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1683858/+subscriptions