yahoo-eng-team team mailing list archive
Message #73540
[Bug 1683858] Re: Allocation records do not contain overhead information
Gonna kill this one. We seem to have reached the consensus that
overhead is something an operator may manage however they like; it is
not something we will generically manage.
In the future it might make sense for the virt drivers to handle
overhead via the reserved inventory field when they are working with
update_provider_tree.
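For illustration only, a driver could fold its estimated overhead into
the reserved value of its MEMORY_MB inventory when it builds the
provider tree; this is a sketch of the idea, not an agreed design, and
_estimated_overhead_mb() is a made-up helper:

    # Hypothetical sketch, assuming the ProviderTree data()/update_inventory()
    # interface; _estimated_overhead_mb() is an invented helper.
    def update_provider_tree(self, provider_tree, nodename, allocations=None):
        inventory = provider_tree.data(nodename).inventory
        # Reserve the memory the hypervisor needs for per-instance overhead
        # so placement never hands it out to instances.
        inventory['MEMORY_MB']['reserved'] = self._estimated_overhead_mb()
        provider_tree.update_inventory(nodename, inventory)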
** Changed in: nova
Status: Triaged => Won't Fix
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1683858
Title:
Allocation records do not contain overhead information
Status in OpenStack Compute (nova):
Won't Fix
Bug description:
Some virt drivers report additional overhead per instance for memory
and disk usage on a compute node. However, that overhead is not
reported in the allocation records for a given instance on a resource
provider (compute node):
https://github.com/openstack/nova/blob/15.0.0/nova/scheduler/client/report.py#L157
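For context, the allocation reported there is sized purely from the
flavor; roughly speaking (a simplified sketch, not the exact code
behind the link above):

    # Simplified sketch: allocations sent to placement come from the flavor
    # alone, so no per-instance overhead appears in them.
    def instance_to_allocations(flavor):
        return {
            'MEMORY_MB': flavor.memory_mb,
            'VCPU': flavor.vcpus,
            'DISK_GB': flavor.root_gb + flavor.ephemeral_gb,
        }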
The overhead is used as part of the claim test on the compute node
when creating or moving an instance. For creating an instance, that's
done here:
https://github.com/openstack/nova/blob/15.0.0/nova/compute/resource_tracker.py#L144-L156
https://github.com/openstack/nova/blob/15.0.0/nova/compute/claims.py#L165
Where Claim.memory_mb is the instance.flavor.memory_mb + overhead:
https://github.com/openstack/nova/blob/15.0.0/nova/compute/claims.py#L106
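By contrast, the compute-side claim adds the driver-reported overhead
on top of the flavor. A minimal sketch of the idea
(estimate_instance_overhead() is the virt driver hook, the rest is
simplified):

    # Simplified sketch: what the resource tracker claims locally.
    overhead = driver.estimate_instance_overhead(instance)  # e.g. {'memory_mb': 64, 'disk_gb': 0}
    claimed_memory_mb = instance.flavor.memory_mb + overhead['memory_mb']
    # The allocation sent to placement is just instance.flavor.memory_mb,
    # so the two numbers diverge by the overhead.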
So ultimately what we claim on the compute node is not what we report
to placement in the allocations for that instance. This matters
because when the filter scheduler asks placement for a list of
resource providers that can fit a given request's memory_mb and
disk_gb, it relies on the inventory for the compute node resource
provider and the existing usage (allocations) for that provider, and
we aren't reporting the full story to placement.
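In other words, the scheduler's placement query is sized from the
flavor only, something like this (illustrative only; the exact query
building differs):

    # Illustrative only: the resource filter sent to placement knows nothing
    # about per-instance overhead.
    resources = 'VCPU:%d,MEMORY_MB:%d,DISK_GB:%d' % (
        flavor.vcpus, flavor.memory_mb, flavor.root_gb + flavor.ephemeral_gb)
    # e.g. GET /resource_providers?resources=VCPU:1,MEMORY_MB:2048,DISK_GB:20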
This could lead to placement telling the filter scheduler there is
room to place an instance on a given compute node when in fact the
claim could fail once we get to the host, which would result in a
retry of the build on another host (which can be expensive).
Also, when we start having multi-cell support with a top-level
conductor that the computes can't reach, we won't have build retries
anymore, so a failed claim would simply end the build and put the
instance into ERROR state. So it's critical that the placement
service has the proper information for making the correct decision on
the first try.
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1683858/+subscriptions