yahoo-eng-team team mailing list archive

[Bug 1793533] Re: Scheduler doesn't filter out deleted compute node records based on placement RP UUIDs

 

Reviewed:  https://review.openstack.org/604108
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=37f3444c32ccb72076a1a6549c183f40c33fe684
Submitter: Zuul
Branch:    master

commit 37f3444c32ccb72076a1a6549c183f40c33fe684
Author: Dan Smith <dansmith@xxxxxxxxxx>
Date:   Thu Sep 20 07:15:25 2018 -0700

    Filter deleted computes from get_all_by_uuids()
    
    Fix ComputeNodeList.get_all_by_uuids() to use model_query() so that
    deleted compute nodes are filtered from the results. Without this,
    a stale result from placement could cause us to choose a compute
    node as a scheduling destination that has since been deleted.
    
    Change-Id: I811e84af46d678c3fdbf94ee400eabe659fc3d4e
    Closes-Bug: #1793533
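
The effect of the fix can be illustrated with a minimal sketch. This is not nova's code: it uses the standard-library sqlite3 module instead of nova's database layer, and the table layout, the make_db() helper, and the include_deleted parameter are hypothetical. It assumes a soft-delete convention in which a row with deleted = 0 is live and any other value marks it as deleted.

```python
import sqlite3


def make_db():
    """Build a hypothetical in-memory compute_nodes table for illustration."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE compute_nodes ("
        "id INTEGER PRIMARY KEY, uuid TEXT, host TEXT, "
        "deleted INTEGER DEFAULT 0)"
    )
    conn.executemany(
        "INSERT INTO compute_nodes (uuid, host, deleted) VALUES (?, ?, ?)",
        [
            ("uuid-a", "host-a", 0),
            ("uuid-b", "host-b", 0),
            ("uuid-c", "host-c", 3),  # soft-deleted row (assumed convention)
        ],
    )
    return conn


def get_all_by_uuids(conn, uuids, include_deleted=False):
    """Return (uuid, host) rows for the given UUIDs.

    With include_deleted=False, soft-deleted rows are filtered out, which is
    the behaviour the commit message describes. With include_deleted=True the
    query mimics the unfiltered lookup that could return stale nodes.
    """
    if not uuids:
        return []
    placeholders = ",".join("?" for _ in uuids)
    sql = f"SELECT uuid, host FROM compute_nodes WHERE uuid IN ({placeholders})"
    if not include_deleted:
        sql += " AND deleted = 0"
    sql += " ORDER BY uuid"
    return conn.execute(sql, list(uuids)).fetchall()
```

If placement still reports a resource provider UUID for a node that has been deleted, the unfiltered lookup would hand that node back to the scheduler, while the filtered lookup drops it.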


** Changed in: nova
       Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1793533

Title:
  Scheduler doesn't filter out deleted compute node records based on
  placement RP UUIDs

Status in OpenStack Compute (nova):
  Fix Released
Status in OpenStack Compute (nova) ocata series:
  Triaged
Status in OpenStack Compute (nova) pike series:
  Triaged
Status in OpenStack Compute (nova) queens series:
  Triaged
Status in OpenStack Compute (nova) rocky series:
  Triaged

Bug description:
  If you are taking a nova-compute service out of service permanently,
  the logical steps would be:

  1) Take down the service
  2) Delete it from the service list (nova service-delete <uuid>)

  However, this does not delete the compute node record, which stays
  forever and causes the scheduler to keep warning about it:

  2018-09-20 13:15:45.312 131035 WARNING nova.scheduler.host_manager
  [req-c4a7c383-c606-48a7-b870-cc143710114a
  234412d3482f4707877ca696e105bf5b acb15d2ffaae4eda98580c7b874d7f89 -
  default default] No compute service record found for host
  <snip>.vexxhost.net

  https://github.com/openstack/nova/blob/master/nova/scheduler/host_manager.py#L716-L720

  We should delete the compute node record when its nova-compute service
  is deleted, or that section should clean up automatically while warning
  (because service records can be rebuilt anyway?)
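
As a rough sketch of the behaviour described above, and not the actual host_manager.py code, the following shows a loop that skips compute nodes with no matching service record and logs the warning quoted in the report. The function name usable_hosts and the dictionary shapes are hypothetical.

```python
import logging

LOG = logging.getLogger(__name__)


def usable_hosts(compute_nodes, services):
    """Return host names of compute nodes that have a service record.

    compute_nodes: iterable of dicts with a "host" key (hypothetical shape).
    services: dict mapping host name to its service record.
    """
    usable = []
    for node in compute_nodes:
        host = node["host"]
        if host not in services:
            # The orphaned compute node record is skipped, but nothing
            # removes it, so this warning repeats on every call.
            LOG.warning("No compute service record found for host %s", host)
            continue
        usable.append(host)
    return usable
```

Because the orphaned record is only skipped and never removed, the warning recurs until the compute node record itself is deleted.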

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1793533/+subscriptions

