yahoo-eng-team team mailing list archive

[Bug 1532562] Re: Cell capacities updates include available resources of compute nodes "down"

 

Reviewed:  https://review.openstack.org/265651
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=2d03bb97a309341c5a2bcc978220cd5af5f32179
Submitter: Jenkins
Branch:    master

commit 2d03bb97a309341c5a2bcc978220cd5af5f32179
Author: Belmiro Moreira <moreira.belmiro.email.lists@xxxxxxxxx>
Date:   Sun Jan 10 16:51:06 2016 +0100

    Fix cell capacity when compute nodes are down
    
    Available resources from compute nodes that are not sending
    service heartbeats (not alive) should not be considered in cell
    capacity updates.
    
    Closes Bug: #1532562
    
    Change-Id: I0a456053d9c5e5fba39eb92f4820003e86d7a205


** Changed in: nova
       Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1532562

Title:
  Cell capacities updates include available resources of compute nodes
  "down"

Status in OpenStack Compute (nova):
  Fix Released

Bug description:
  If a child cell has compute nodes that are enabled but have stopped
  sending heartbeat updates (shown in the XXX state by "nova-manage
  service list"), the child cell continues to count the available
  resources of these compute nodes when updating the cell capacity.
  This is problematic when running several cells and trying to fill
  them completely. Requests are sent to the cell that can fit the most
  instances of the requested type. However, when compute nodes in that
  cell are "down", the requests fail there with "No valid host".

  When updating the cell capacity, the "disabled" compute nodes are
  excluded. The same should happen if a compute node has not sent a
  heartbeat update within "CONF.service_down_time".
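
  The rule described above can be sketched as follows. This is only
  an illustration, not nova's actual code: the ComputeNode class, the
  is_alive and cell_capacity helpers, and the capacity keys are all
  hypothetical names, and the 60-second value is assumed here to stand
  in for "CONF.service_down_time".

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Assumed stand-in for CONF.service_down_time, in seconds.
SERVICE_DOWN_TIME = 60


@dataclass
class ComputeNode:
    """Hypothetical, simplified view of a compute node and its service."""
    host: str
    disabled: bool
    last_heartbeat: datetime
    free_ram_mb: int
    free_disk_gb: int


def is_alive(node, now, down_time=SERVICE_DOWN_TIME):
    """A node is alive if its last heartbeat is within down_time seconds."""
    return (now - node.last_heartbeat) <= timedelta(seconds=down_time)


def cell_capacity(nodes, now, down_time=SERVICE_DOWN_TIME):
    """Sum free resources over nodes that are both enabled and alive."""
    usable = [
        n for n in nodes
        if not n.disabled and is_alive(n, now, down_time)
    ]
    return {
        "ram_free_mb": sum(n.free_ram_mb for n in usable),
        "disk_free_gb": sum(n.free_disk_gb for n in usable),
    }
```

  With this filter, a node that is enabled but has missed its
  heartbeats contributes nothing to the reported capacity, exactly
  like a "disabled" node.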

  How to reproduce:
  1) Have a cell environment with 2 child cells (A and B).
  2) Run nova-cells in "debug". Confirm that the "Received capacities from child cell" messages for A and B (in the top nova-cells log) match the number of available resources.
  3) Stop some compute nodes in cell A.
  4) Confirm that the "Received capacities from child cell A" values don't change.
  5) The cell scheduler can now send requests to cell A that fail with "No valid host".

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1532562/+subscriptions

