yahoo-eng-team team mailing list archive
Message #65014
[Bug 1698383] Re: Resource tracker regressed reporting negative memory
*** This bug is a duplicate of bug 1635367 ***
https://bugs.launchpad.net/bugs/1635367
Reviewed: https://review.openstack.org/474994
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=0ddf3ce01149d78ee0cf8f7497f8a9074c6f167d
Submitter: Jenkins
Branch: master
commit 0ddf3ce01149d78ee0cf8f7497f8a9074c6f167d
Author: Dan Smith <dansmith@xxxxxxxxxx>
Date: Fri Jun 16 07:25:40 2017 -0700
Fix regression preventing reporting negative resources for overcommit
In Nova prior to Ocata, the scheduler computes available resources for
a compute node, attempting to mirror the same calculation that happens
locally. It does this to determine if a new instance should fit on the
node. If overcommit is being used, some of these numbers can be negative.
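As an illustration of how a negative value arises, here is a minimal sketch of that kind of calculation. The function names, the numbers, and the 1.5 allocation ratio are assumptions made for this example; this is not Nova's actual code.

```python
# Simplified sketch of overcommit arithmetic (hypothetical helper names).

def free_ram_mb(total_mb, used_mb):
    """Free RAM as plain subtraction; it goes negative once the node is overcommitted."""
    return total_mb - used_mb


def fits(total_mb, reported_free_mb, requested_mb, allocation_ratio):
    """Return True if a request fits under the overcommit limit."""
    limit_mb = total_mb * allocation_ratio    # overcommitted capacity
    used_mb = total_mb - reported_free_mb     # usage implied by the free value
    return limit_mb - used_mb >= requested_mb


total = 1000
free = free_ram_mb(total, used_mb=1400)       # -400: already past physical RAM
print(free)
print(fits(total, free, requested_mb=100, allocation_ratio=1.5))  # 100 MB left under the limit
print(fits(total, free, requested_mb=200, allocation_ratio=1.5))  # exceeds the limit
```

With these numbers the limit is 1500 MB, 1400 MB is in use, and only the 100 MB request fits.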
In change 016b810f675b20e8ce78f4c82dc9c679c0162b7a we changed the
compute side to never report negative resources, which was an ironic-
specific fix for nodes that are offline. That, however, has been
corrected for ironic nodes in 047da6498dbb3af71bcb9e6d0e2c38aa23b06615.
Since the base change to the resource tracker has caused the scheduler
and compute to do different math, we need to revert it to avoid the
scheduler sending instances to nodes where it believes -NNN is the
lower limit (with overcommit), but the node is reporting zero.
This doesn't actually affect Ocata because of our use of the placement
engine. However, this code is still in master and needs to be backported.
This part of the change actually didn't even have a unit test, so
this patch adds one to validate that the resource tracker will
calculate and report negative resources.
Change-Id: I25ba6f7f4e4fab6db223368427d889d6b06a77e8
Closes-Bug: #1698383
** Changed in: nova
Status: In Progress => Fix Released
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1698383
Title:
Resource tracker regressed reporting negative memory
Status in OpenStack Compute (nova):
Fix Released
Bug description:
Nova's resource tracker is expected to publish negative values to the
scheduler when resources are overcommitted. Nova's scheduler expects
this:
https://github.com/openstack/nova/blob/a43dbba2b8feea063ed2d0c79780b4c3507cf89b/nova/scheduler/host_manager.py#L215
In change https://review.openstack.org/#/c/306670, these values were
filtered to never drop below zero, which is incorrect. That change was
making a complex alteration for ironic and cells, specifically to
avoid resources from ironic nodes showing up as negative when they
were unavailable. That was a cosmetic fix (which I believe has been
corrected for ironic only in this patch:
https://review.openstack.org/#/c/230487/).
Regardless, since the scheduler does the same calculation to determine
available resources on the node, if the node reports 0 when the
scheduler calculates -100 for a given resource, the scheduler will
assume the node still has room (due to oversubscription) and will send
builds there that are destined to fail.
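The mismatch can be sketched in a few lines. This is a simplified model with hypothetical names and made-up numbers (1000 MB total, 1100 MB used, a 1.5 allocation ratio), not the real resource tracker or scheduler code.

```python
# Hypothetical model: a node that clamps its reported free RAM at zero
# versus one that reports the true (negative) value.

def usable_ram_mb(total_mb, reported_free_mb, allocation_ratio):
    """Headroom the scheduler believes is left under the overcommit limit."""
    limit_mb = total_mb * allocation_ratio
    used_mb = total_mb - reported_free_mb
    return limit_mb - used_mb


total_mb, used_mb, ratio = 1000, 1100, 1.5
true_free = total_mb - used_mb        # -100, what the scheduler expects to see
clamped_free = max(0, true_free)      # 0, what the regressed node reports

request_mb = 450
true_room = usable_ram_mb(total_mb, true_free, ratio)        # 400.0
clamped_room = usable_ram_mb(total_mb, clamped_free, ratio)  # 500.0

print(request_mb <= true_room)     # False: the node cannot take the build
print(request_mb <= clamped_room)  # True: the scheduler sends it anyway
```

Clamping hides 100 MB of real usage, so a 450 MB request is accepted even though only 400 MB remains under the limit.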
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1698383/+subscriptions