yahoo-eng-team team mailing list archive

[Bug 1503453] Re: unavailable ironic nodes being scheduled to

Reviewed:  https://review.openstack.org/306670
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=016b810f675b20e8ce78f4c82dc9c679c0162b7a
Submitter: Jenkins
Branch:    master

commit 016b810f675b20e8ce78f4c82dc9c679c0162b7a
Author: Jesse J. Cook <jesse.j.cook@xxxxxxxxxxxxxx>
Date:   Sat Apr 16 00:35:34 2016 +0000

    Unavailable hosts have no resources for use
    
    If a host's:
    
      * resources are unavailable
      * in a unusable state
    
    the system should:
    
      * report 0 resources
      * show 0 resources
      * not be scheduled to
    
    Change-Id: Ia1c2f6f161dde1e23acce85a54566d07805d13df
    Closes-Bug: 1503453


** Changed in: nova
       Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1503453

Title:
  unavailable ironic nodes being scheduled to

Status in OpenStack Compute (nova):
  Fix Released

Bug description:
  When the compute resource tracker checks nodes, the ironic driver
  checks each node against a list of states for which it should return
  resources. This keeps nova from scheduling to nodes that are in
  unavailable ironic states, such as those going through our cleaning
  process.

  The logic around this check (
  https://github.com/openstack/nova/blob/master/nova/virt/ironic/driver.py#L334-L351
  ) looks for existing instances on the node, and if they aren't found
  it then looks at the conditions for returning the node as unavailable.

  The problem arises when a node has an orphaned instance: one that
  ironic sees as present but nova does not (usually nova lists it as
  having been deleted).

  The instance detection will return true, causing the memory_mb_used
  and memory_mb values to be set to the retrieved value from
  instance_info['memory_mb'].

  The check for _node_resources_unavailable will not run, because it is
  an elif on the instance check. This means that even if the node is in
  maintenance state, we never notice, and we do not return all zeros for
  resources as we normally would.
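
  The if/elif behaviour described above can be reproduced with a small
  model. This is only a sketch, not the driver's code: the Node class,
  node_resources_unavailable and node_resource are hypothetical names,
  and the unavailability check is reduced to the maintenance flag.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Hypothetical stand-in for an ironic node record."""
    memory_mb: int
    instance_uuid: Optional[str] = None
    maintenance: bool = False


def node_resources_unavailable(node: Node) -> bool:
    # Assumption: reduced to the maintenance flag; the bug report says the
    # driver checks the node against a list of states.
    return node.maintenance


def node_resource(node: Node) -> dict:
    memory_mb = node.memory_mb
    memory_mb_used = 0
    if node.instance_uuid:
        # Ironic sees an instance: report the memory as fully used.
        memory_mb_used = memory_mb
    elif node_resources_unavailable(node):
        # Only reached when no instance was detected on the node.
        memory_mb = 0
    return {"memory_mb": memory_mb, "memory_mb_used": memory_mb_used}


orphaned = Node(memory_mb=8192, instance_uuid="orphan", maintenance=True)
empty = Node(memory_mb=8192, maintenance=True)
print(node_resource(orphaned))  # maintenance is never looked at
print(node_resource(empty))     # zeros, as intended
```

  In this model the orphaned node still reports its full memory_mb even
  though it is in maintenance, while the empty node in maintenance
  reports zeros.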

  Once the resource tracker calls _update_usage_from_instance, it will
  not find an instance associated with the node from nova's point of
  view and will return all of the memory as available instead, causing
  builds to be scheduled to this node.
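
  A rough illustration of that last step, assuming usage is rebuilt
  purely from the instances nova knows about. free_memory_mb is a
  hypothetical helper, not the resource tracker's API.

```python
def free_memory_mb(reported: dict, nova_instances: list) -> int:
    """Free memory as seen by the scheduler in this simplified model."""
    # Assumption: the driver's memory_mb_used is discarded and usage is
    # recomputed from the instances nova has on the node.
    used = sum(inst["memory_mb"] for inst in nova_instances)
    return reported["memory_mb"] - used


# The driver reported the node as fully used because of the orphaned instance.
reported = {"memory_mb": 8192, "memory_mb_used": 8192}

# Nova has no record of that instance, so its list for the node is empty.
print(free_memory_mb(reported, nova_instances=[]))
```

  With an empty instance list, the whole of memory_mb comes back as
  free, which is what makes the node look schedulable.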

  Ironic will then fail the build attempt, because it shows an instance
  already associated with the node.
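
  One way to get the behaviour the commit message asks for (unavailable
  hosts report 0 resources) is to test for unavailability before looking
  at instances. The sketch below is an assumption about ordering only,
  not necessarily how the commit implements it; all names are
  hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Hypothetical stand-in for an ironic node record."""
    memory_mb: int
    instance_uuid: Optional[str] = None
    maintenance: bool = False


def node_resource_unavailable_first(node: Node) -> dict:
    if node.maintenance:
        # Assumed ordering: unavailability wins, even if ironic still
        # sees an instance on the node.
        return {"memory_mb": 0, "memory_mb_used": 0}
    used = node.memory_mb if node.instance_uuid else 0
    return {"memory_mb": node.memory_mb, "memory_mb_used": used}


orphaned = Node(memory_mb=8192, instance_uuid="orphan", maintenance=True)
print(node_resource_unavailable_first(orphaned))
```

  Here the orphaned node in maintenance reports zero memory, so there is
  nothing for the resource tracker to hand back as free.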

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1503453/+subscriptions

