yahoo-eng-team team mailing list archive
Message #74822
[Bug 1793533] Re: Scheduler doesn't filter out deleted compute node records based on placement RP UUIDs
Reviewed: https://review.openstack.org/604108
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=37f3444c32ccb72076a1a6549c183f40c33fe684
Submitter: Zuul
Branch: master
commit 37f3444c32ccb72076a1a6549c183f40c33fe684
Author: Dan Smith <dansmith@xxxxxxxxxx>
Date: Thu Sep 20 07:15:25 2018 -0700
Filter deleted computes from get_all_by_uuids()
Fix ComputeNodeList.get_all_by_uuids() to use model_query() so that
deleted compute nodes are filtered from the results. Without this,
a stale result from placement could cause us to choose a compute
node as a scheduling destination that has since been deleted.
Change-Id: I811e84af46d678c3fdbf94ee400eabe659fc3d4e
Closes-Bug: #1793533
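The effect of the fix described in the commit message can be illustrated with a minimal, self-contained sketch. This is not nova's code: the table layout, the make_db() helper, the include_deleted flag, and the assumed convention that deleted = 0 marks a live row are all hypothetical stand-ins, used only to show how a lookup by UUID can return soft-deleted rows unless the query filters them out.

```python
import sqlite3


def make_db():
    """Build a throwaway in-memory table (hypothetical schema, not nova's)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE compute_nodes ("
        "id INTEGER PRIMARY KEY, uuid TEXT, host TEXT, "
        "deleted INTEGER DEFAULT 0)"
    )
    conn.executemany(
        "INSERT INTO compute_nodes (uuid, host, deleted) VALUES (?, ?, ?)",
        [
            ("uuid-a", "host-a", 0),  # live row
            ("uuid-b", "host-b", 2),  # soft-deleted row (assumed: non-zero)
            ("uuid-c", "host-c", 0),  # live row
        ],
    )
    return conn


def get_all_by_uuids(conn, uuids, include_deleted=False):
    """Return matching UUIDs, skipping soft-deleted rows by default."""
    if not uuids:
        return []
    placeholders = ",".join("?" for _ in uuids)
    sql = f"SELECT uuid FROM compute_nodes WHERE uuid IN ({placeholders})"
    if not include_deleted:
        # The filter whose absence lets stale results through.
        sql += " AND deleted = 0"
    sql += " ORDER BY uuid"
    return [row[0] for row in conn.execute(sql, list(uuids))]


if __name__ == "__main__":
    conn = make_db()
    stale_uuids = ["uuid-a", "uuid-b"]  # e.g. a stale list from placement
    print(get_all_by_uuids(conn, stale_uuids, include_deleted=True))
    print(get_all_by_uuids(conn, stale_uuids))
```

With include_deleted=True the sketch mimics the pre-fix behaviour, where a stale UUID still resolves to a row; the default call mimics the filtered behaviour.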
** Changed in: nova
Status: In Progress => Fix Released
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1793533
Title:
Scheduler doesn't filter out deleted compute node records based on
placement RP UUIDs
Status in OpenStack Compute (nova):
Fix Released
Status in OpenStack Compute (nova) ocata series:
Triaged
Status in OpenStack Compute (nova) pike series:
Triaged
Status in OpenStack Compute (nova) queens series:
Triaged
Status in OpenStack Compute (nova) rocky series:
Triaged
Bug description:
If you are taking a nova-compute service out of service permanently,
the logical steps would be:
1) Take down the service
2) Delete it from the service list (nova service-delete <uuid>)
However, this does not delete the compute node record, which stays
forever and leads the scheduler to always complain about it as well:
2018-09-20 13:15:45.312 131035 WARNING nova.scheduler.host_manager
[req-c4a7c383-c606-48a7-b870-cc143710114a
234412d3482f4707877ca696e105bf5b acb15d2ffaae4eda98580c7b874d7f89 -
default default] No compute service record found for host
<snip>.vexxhost.net
https://github.com/openstack/nova/blob/master/nova/scheduler/host_manager.py#L716-L720
We should be deleting the compute node if a nova-compute binary is
deleted, or that section should automatically clean up while warning
(because service records can be rebuilt anyways?)
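The situation described above can be sketched roughly as follows. This is a simplified illustration, not the code at the linked host_manager.py lines: the function name split_orphaned_nodes, the dict-based records, and the host names are hypothetical. It only shows how a compute node record that outlives its service record gets skipped with a warning on every pass, and how the orphaned records could be collected for cleanup as the report suggests.

```python
import logging

LOG = logging.getLogger("scheduler_sketch")


def split_orphaned_nodes(compute_nodes, services):
    """Separate usable compute nodes from ones with no service record.

    compute_nodes: list of dicts with a "host" key (hypothetical shape).
    services: dict mapping host name to its service record.
    """
    usable, orphaned = [], []
    for node in compute_nodes:
        if node["host"] in services:
            usable.append(node)
        else:
            # Same wording as the warning quoted in the bug description.
            LOG.warning(
                "No compute service record found for host %s", node["host"]
            )
            orphaned.append(node)
    return usable, orphaned


if __name__ == "__main__":
    logging.basicConfig(level=logging.WARNING)
    nodes = [{"host": "host-a"}, {"host": "host-b"}]
    # host-b's service was deleted, but its compute node record remains.
    services = {"host-a": {"binary": "nova-compute"}}
    usable, orphaned = split_orphaned_nodes(nodes, services)
    print("usable:", usable)
    print("candidates for cleanup:", orphaned)
```

Returning the orphaned records instead of only logging them is one possible design for the "clean up while warning" option; whether cleanup is safe to do automatically is left open in the report.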
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1793533/+subscriptions