yahoo-eng-team team mailing list archive

[Bug 1528743] [NEW] HostState in Scheduler can be incorrect

 

Public bug reported:

In nova-scheduler, we now use scheduler/host_manager/update_from_compute_node() to update information about a
host from a ComputeNode object. At the beginning of this function, there are a few lines of code:

https://github.com/openstack/nova/blob/master/nova/scheduler/host_manager.py#L162-L164

if (self.updated and compute.updated_at
        and self.updated > compute.updated_at):
    return

Here the information is not updated if the scheduler's local update time is later than the compute node's update time.
This is generally correct, since the compute service has a periodic task that updates the information:
https://github.com/openstack/nova/blob/master/nova/compute/manager.py#L6243

but that task only writes an update if the resources have changed:
https://github.com/openstack/nova/blob/master/nova/compute/resource_tracker.py#L659

This can lead to inconsistency if the scheduler has consumed resources (and so updated its local information) but
the compute then fails to claim them: the periodic task will not update the record because nothing has changed,
so the scheduler keeps its stale view.
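
The sequence above can be reproduced with a small stand-alone model. This is only a sketch: the ComputeNode
and HostState classes here are simplified stand-ins written for this report (not the real nova classes), the
consume() helper and the free_ram_mb bookkeeping are assumptions, and the timestamps are arbitrary. Only the
guard at the top of update_from_compute_node() is taken from the code quoted earlier.

```python
from datetime import datetime, timedelta


class ComputeNode:
    """Hypothetical stand-in for the ComputeNode record written by nova-compute."""

    def __init__(self, free_ram_mb, updated_at):
        self.free_ram_mb = free_ram_mb
        self.updated_at = updated_at


class HostState:
    """Simplified model of the scheduler's in-memory view of one host."""

    def __init__(self):
        self.free_ram_mb = 0
        self.updated = None

    def update_from_compute_node(self, compute):
        # Same guard as in the report: keep the local view if it is newer.
        if (self.updated and compute.updated_at
                and self.updated > compute.updated_at):
            return
        self.free_ram_mb = compute.free_ram_mb
        self.updated = compute.updated_at

    def consume(self, ram_mb, now):
        # Assumed helper: the scheduler books resources locally and
        # bumps its own timestamp when it picks this host.
        self.free_ram_mb -= ram_mb
        self.updated = now


t0 = datetime(2015, 12, 23, 12, 0, 0)  # arbitrary timestamp
node = ComputeNode(free_ram_mb=4096, updated_at=t0)

host = HostState()
host.update_from_compute_node(node)                 # local view: 4096 MB free
host.consume(1024, now=t0 + timedelta(seconds=5))   # local view: 3072 MB free

# The claim fails on the compute side, so nothing changes there and the
# periodic task leaves node.updated_at untouched.
host.update_from_compute_node(node)                 # guard returns early

stale_free_ram = host.free_ram_mb  # still 3072, although the node has 4096
```

In this model every later refresh is skipped as well, because node.updated_at never moves past the
scheduler's own timestamp until some other resource change happens on the host.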

We can add a time limit, as a config option, to the "if" logic mentioned above, so that if the difference
between the current time and self.updated is larger than the limit, we also update the
information from the ComputeNode object and avoid the inconsistency between the services.
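
A rough sketch of how the extended check could look is given below. The function name should_skip_update,
the max_staleness parameter, and the 60 second default are all hypothetical; no such option exists in nova
today, and the right default would need discussion. The function is written stand-alone so the decision
logic can be checked in isolation.

```python
from datetime import datetime, timedelta

# Hypothetical default for the proposed config option.
MAX_STALENESS = timedelta(seconds=60)


def should_skip_update(local_updated, compute_updated_at, now,
                       max_staleness=MAX_STALENESS):
    """Return True if the scheduler should keep its local HostState view.

    local_updated      -- the scheduler's own timestamp (self.updated)
    compute_updated_at -- the updated_at of the ComputeNode record
    now                -- the current time, passed in to keep this testable
    """
    # Existing behaviour: with a missing timestamp, always refresh.
    if not (local_updated and compute_updated_at):
        return False
    # Existing behaviour: the compute record is at least as new, so refresh.
    if local_updated <= compute_updated_at:
        return False
    # Proposed addition: the local view is newer, but only trust it
    # while it is younger than the limit.
    return (now - local_updated) <= max_staleness
```

With this check, a local view left behind by a failed claim would be overwritten by the ComputeNode data
once it is older than the limit, instead of persisting until the next real resource change on the host.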

** Affects: nova
     Importance: Undecided
     Assignee: Zhenyu Zheng (zhengzhenyu)
         Status: New

** Changed in: nova
     Assignee: (unassigned) => Zhenyu Zheng (zhengzhenyu)

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1528743

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1528743/+subscriptions