
yahoo-eng-team team mailing list archive

[Bug 1579213] [NEW] ComputeFilter fails because compute node has not been heard from in a while

 

Public bug reported:

Description
===========

When scheduling an instance with Nova and Ironic, ComputeFilter ignores
some hypervisors because each of them "has not been heard from in a
while".

Expected result
===============

I expect all hypervisors to be available to nova-scheduler.

Actual result
=============

Some hypervisors are ignored due to the service being "down".

I found that:
* ComputeFilter ignores a hypervisor if its "nova.compute_nodes.updated_at" field is older than the "service_down_time" config option allows.
* When the nova-compute service starts, the field is updated correctly.
* Subsequent resource usage updates do not update the field until the service is restarted.
* The resource tracker does not update the scheduler state (or the field) if no change is found for the hypervisor. [1] With those lines commented out, nova-compute updates the updated_at field correctly and nova-scheduler no longer ignores the hypervisors.

As a result, nova-scheduler treats these hypervisors as down, and not
all of them are available during scheduling.
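
The staleness check described in the findings above can be sketched
roughly as follows. This is a minimal illustration, not nova's code: the
function name is_heard_from and the 60-second value are assumptions made
for the example; only the updated_at field and the service_down_time
option come from this report.

```python
from datetime import datetime, timedelta

# Example value standing in for the "service_down_time" option (assumed).
SERVICE_DOWN_TIME = 60  # seconds


def is_heard_from(updated_at, now, service_down_time=SERVICE_DOWN_TIME):
    """Return True if updated_at is recent enough for the node to count as up.

    Hypothetical helper: compares the age of the last update against the
    allowed down time, which is the behaviour the report attributes to
    ComputeFilter.
    """
    elapsed = (now - updated_at).total_seconds()
    return abs(elapsed) <= service_down_time


if __name__ == "__main__":
    now = datetime(2016, 5, 6, 12, 0, 0)
    # Updated 30 seconds ago: still considered up.
    print(is_heard_from(now - timedelta(seconds=30), now))
    # Updated 5 minutes ago: filtered out as "not heard from in a while".
    print(is_heard_from(now - timedelta(minutes=5), now))
```

Under this sketch, a node whose updated_at stops advancing is filtered
out once its age exceeds the threshold, even if the nova-compute process
is still running.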

Environment
===========

Nova 2015.1.2

[1]
https://github.com/openstack/nova/blob/d619ad6ba15df1cf7dc92ddf84d1c65af018682f/nova/compute/resource_tracker.py#L632-L633
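
The skip-on-no-change behaviour attributed to [1] can be illustrated
with a small model. This is a sketch under assumptions, not the code at
that link: the class TrackerSketch, its update method, and the
skip_if_unchanged flag are hypothetical names, and timestamps are plain
integers (seconds) to keep the example short.

```python
class TrackerSketch:
    """Hypothetical stand-in for a resource tracker; not nova's class."""

    def __init__(self, resources, now):
        self._last_reported = None
        self.updated_at = None
        # The first report at service start always writes the record.
        self.update(resources, now)

    def update(self, resources, now, skip_if_unchanged=True):
        """Report resources; return True if the record was written."""
        if skip_if_unchanged and resources == self._last_reported:
            # Nothing changed, so nothing is written and updated_at
            # keeps its old value.
            return False
        self._last_reported = dict(resources)
        self.updated_at = now
        return True


if __name__ == "__main__":
    node = {"vcpus": 8, "memory_mb": 32768}
    tracker = TrackerSketch(node, now=0)

    # Periodic update with identical values: updated_at stays at 0.
    tracker.update(node, now=120)
    print(tracker.updated_at)

    # Same update with the early return disabled: updated_at advances.
    tracker.update(node, now=180, skip_if_unchanged=False)
    print(tracker.updated_at)
```

In this model, a node that reports identical values on every periodic
update keeps the timestamp from service start, which matches the
observation that the field is only refreshed when nova-compute is
restarted.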

** Affects: nova
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1579213

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1579213/+subscriptions