yahoo-eng-team team mailing list archive
Message #74796
[Bug 1793533] Re: Deleting a service with nova-compute binary doesn't remove compute node
The related issue is that the scheduler was not filtering out deleted
compute node records when pulling them from the cell DB, because this
query doesn't exclude deleted records:
https://github.com/openstack/nova/blob/d87852ae6a1987b6faa3cb5851f9758b47ef4636/nova/objects/compute_node.py#L443
Granted, if the resource provider record in placement had been cleaned
up properly, we wouldn't have gotten that far anyway, but it's still an
issue.
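To illustrate the difference a deleted-record filter makes, here is a
minimal sketch using sqlite3 rather than nova's actual DB layer. The
table layout, the get_nodes_by_uuids helper, and the convention that
"deleted = 0" marks a live row are assumptions made for the example, not
the real nova schema or query.

```python
import sqlite3


def get_nodes_by_uuids(conn, uuids, include_deleted=False):
    """Return (uuid, host) rows for the given compute node UUIDs.

    Hypothetical helper: with include_deleted=True it mimics a query
    that forgets to exclude soft-deleted rows.
    """
    if not uuids:
        return []
    placeholders = ",".join("?" for _ in uuids)
    sql = f"SELECT uuid, host FROM compute_nodes WHERE uuid IN ({placeholders})"
    if not include_deleted:
        # Assumed soft-delete convention: live rows have deleted = 0.
        sql += " AND deleted = 0"
    sql += " ORDER BY uuid"
    return conn.execute(sql, list(uuids)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE compute_nodes ("
    "id INTEGER PRIMARY KEY, uuid TEXT, host TEXT, deleted INTEGER DEFAULT 0)"
)
conn.executemany(
    "INSERT INTO compute_nodes (id, uuid, host, deleted) VALUES (?, ?, ?, ?)",
    [
        (1, "uuid-a", "host-a", 0),  # live compute node
        (2, "uuid-b", "host-b", 2),  # soft-deleted compute node
    ],
)

rp_uuids = ["uuid-a", "uuid-b"]  # stand-in for UUIDs returned by placement
unfiltered = get_nodes_by_uuids(conn, rp_uuids, include_deleted=True)
filtered = get_nodes_by_uuids(conn, rp_uuids)
```

In this sketch the unfiltered call still returns the soft-deleted node
whenever placement hands back its UUID, which is the situation described
above; the filtered call drops it.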
** Changed in: nova
Status: Invalid => Triaged
** Changed in: nova
Importance: Undecided => Medium
** Summary changed:
- Deleting a service with nova-compute binary doesn't remove compute node
+ Scheduler doesn't filter out deleted compute node records based on placement RP UUIDs
** Also affects: nova/pike
Importance: Undecided
Status: New
** Also affects: nova/rocky
Importance: Undecided
Status: New
** Also affects: nova/ocata
Importance: Undecided
Status: New
** Also affects: nova/queens
Importance: Undecided
Status: New
** Changed in: nova/ocata
Status: New => Triaged
** Changed in: nova/pike
Status: New => Triaged
** Changed in: nova/queens
Status: New => Triaged
** Changed in: nova/rocky
Status: New => Incomplete
** Changed in: nova/rocky
Status: Incomplete => Triaged
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1793533
Title:
Scheduler doesn't filter out deleted compute node records based on
placement RP UUIDs
Status in OpenStack Compute (nova):
Triaged
Status in OpenStack Compute (nova) ocata series:
Triaged
Status in OpenStack Compute (nova) pike series:
Triaged
Status in OpenStack Compute (nova) queens series:
Triaged
Status in OpenStack Compute (nova) rocky series:
Triaged
Bug description:
If you are taking a nova-compute service out of service permanently,
the logical steps would be:
1) Take down the service
2) Delete it from the service list (nova service-delete <uuid>)
However, this does not delete the compute node record, which stays
around forever and causes the scheduler to keep complaining about it:
2018-09-20 13:15:45.312 131035 WARNING nova.scheduler.host_manager
[req-c4a7c383-c606-48a7-b870-cc143710114a
234412d3482f4707877ca696e105bf5b acb15d2ffaae4eda98580c7b874d7f89 -
default default] No compute service record found for host
<snip>.vexxhost.net
https://github.com/openstack/nova/blob/master/nova/scheduler/host_manager.py#L716-L720
We should be deleting the compute node if a service with the
nova-compute binary is deleted, or that section should automatically
clean up while warning (because service records can be rebuilt anyway?)
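The first option could be sketched roughly as follows. This is an
in-memory toy model, not nova code: the Inventory class, its fields, and
the delete_service method are hypothetical names introduced only to show
the idea of removing compute node records together with the service.

```python
from dataclasses import dataclass, field


@dataclass
class Inventory:
    """Hypothetical stand-in for the service and compute node tables."""

    # service uuid -> (host, binary)
    services: dict = field(default_factory=dict)
    # compute node uuid -> host
    compute_nodes: dict = field(default_factory=dict)

    def delete_service(self, service_uuid):
        """Delete a service; for nova-compute, also drop its compute nodes.

        Returns the sorted list of compute node UUIDs that were removed.
        """
        host, binary = self.services.pop(service_uuid)
        removed = []
        if binary == "nova-compute":
            removed = sorted(
                uuid for uuid, node_host in self.compute_nodes.items()
                if node_host == host
            )
            for uuid in removed:
                del self.compute_nodes[uuid]
        return removed


inv = Inventory(
    services={
        "svc-1": ("host-a", "nova-compute"),
        "svc-2": ("host-b", "nova-scheduler"),
    },
    compute_nodes={"node-1": "host-a", "node-2": "host-c"},
)
removed_for_compute = inv.delete_service("svc-1")
removed_for_scheduler = inv.delete_service("svc-2")
```

Deleting the nova-compute service removes the node on the same host,
while deleting a service with another binary leaves compute nodes alone.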