Message #64998
[Bug 1698383] [NEW] Resource tracker regressed reporting negative memory
Public bug reported:
Nova's resource tracker is expected to publish negative values to the
scheduler when resources are overcommitted. Nova's scheduler expects
this:
https://github.com/openstack/nova/blob/a43dbba2b8feea063ed2d0c79780b4c3507cf89b/nova/scheduler/host_manager.py#L215
In change https://review.openstack.org/#/c/306670, these values were
filtered to never drop below zero, which is incorrect. That change was
making a complex alteration for ironic and cells, specifically to keep
resources from ironic nodes from showing up as negative when they were
unavailable. That was a cosmetic fix, which I believe has been corrected
for ironic only in this patch:
https://review.openstack.org/#/c/230487/
Regardless, since the scheduler does the same calculation to determine
available resources on the node, if the node reports 0 when the
scheduler calculates -100 for a given resource, the scheduler will
assume the node still has room (due to oversubscription) and will send
builds there that are destined to fail.
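
To illustrate the mismatch, here is a minimal sketch of the arithmetic. It is
not nova's actual code: the function name usable_ram_mb, the host size, and
the 1.5 allocation ratio are all assumptions chosen so that the true headroom
comes out to the -100 mentioned above.

```python
def usable_ram_mb(total_mb, free_mb, allocation_ratio):
    """Headroom left under an overcommit limit, given the reported free memory.

    Hypothetical helper for illustration only; not a nova API.
    """
    limit_mb = total_mb * allocation_ratio  # most memory that may be allocated
    used_mb = total_mb - free_mb            # usage implied by the reported free value
    return limit_mb - used_mb


total_mb = 1000          # assumed physical memory on the host
allocation_ratio = 1.5   # assumed overcommit ratio, so the limit is 1500 MB
true_free_mb = -600      # 1600 MB already allocated, i.e. past the limit

# What the node reports if negative values are filtered to zero.
clamped_free_mb = max(true_free_mb, 0)

headroom_true = usable_ram_mb(total_mb, true_free_mb, allocation_ratio)
headroom_clamped = usable_ram_mb(total_mb, clamped_free_mb, allocation_ratio)

print("headroom with negative value reported:", headroom_true)
print("headroom with value clamped to zero:  ", headroom_clamped)
```

Under these assumptions the unclamped report gives a negative headroom, so the
host would be rejected, while the clamped report makes the same host look like
it still has room for a 500 MB instance.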
** Affects: nova
Importance: Undecided
Status: New
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1698383
Title:
Resource tracker regressed reporting negative memory
Status in OpenStack Compute (nova):
New
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1698383/+subscriptions