yahoo-eng-team team mailing list archive

[Bug 1420848] Re: "down" nova-compute service spuriously marked as "up" when disabled/enabled

 

** Changed in: nova
       Status: Fix Committed => Fix Released

** Changed in: nova
    Milestone: None => liberty-1

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1420848

Title:
  "down" nova-compute service spuriously marked as "up" when
  disabled/enabled

Status in OpenStack Compute (Nova):
  Fix Released

Bug description:
  I think our usage of the "updated_at" field to determine whether a
  service is "up" or not is buggy.  Consider this scenario:

  1) nova-compute is happily running and is up/enabled on compute-0
  2) something causes nova-compute to stop (process crash, hardware fault, power failure, network isolation, etc.)
  3) a minute later, the nova-compute service is reported as "down"
  4) I run "nova service-disable compute-0 nova-compute", then "nova service-enable compute-0 nova-compute"
  5) nova-compute is now reported as "up" for the next minute, and the scheduler might try to place instances on it.  Since the service is not actually running, these requests could stall until the RPC timeout expires.
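
The scenario above can be modeled with a small sketch. This is not nova's code: the `ServiceRecord` class, the helper functions, and the 60-second threshold are assumptions made for illustration (the threshold follows the "a minute later" wording in the report), and the timestamps are arbitrary.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Assumed liveness threshold, taken from "a minute later" in the report.
SERVICE_DOWN_TIME = timedelta(seconds=60)


@dataclass
class ServiceRecord:
    """Hypothetical stand-in for a row in a services table."""
    host: str
    disabled: bool
    updated_at: datetime


def heartbeat(svc: ServiceRecord, now: datetime) -> None:
    # A periodic status report from the running service touches the row.
    svc.updated_at = now


def set_disabled(svc: ServiceRecord, disabled: bool, now: datetime) -> None:
    # An admin enable/disable also writes the row, so updated_at moves too.
    svc.disabled = disabled
    svc.updated_at = now


def is_up(svc: ServiceRecord, now: datetime) -> bool:
    # Liveness is inferred from updated_at alone, which is the problem.
    return now - svc.updated_at <= SERVICE_DOWN_TIME


def reproduce():
    t0 = datetime(2015, 2, 11, 12, 0, 0)  # arbitrary start time
    svc = ServiceRecord(host="compute-0", disabled=False, updated_at=t0)
    heartbeat(svc, t0)                       # step 1: service is running

    t1 = t0 + timedelta(seconds=90)          # steps 2-3: it stopped, now stale
    up_before_toggle = is_up(svc, t1)

    set_disabled(svc, True, t1)              # step 4: disable, then enable
    set_disabled(svc, False, t1)
    up_after_toggle = is_up(svc, t1)         # step 5: looks alive again

    up_a_minute_later = is_up(svc, t1 + timedelta(seconds=61))
    return up_before_toggle, up_after_toggle, up_a_minute_later
```

In this model, `reproduce()` shows the service going from "down" to "up" purely because of the admin toggle, then back to "down" once the threshold passes again.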

  I wonder if it would make sense to have a separate "last status
  timestamp" database field that would only get updated when we get a
  service status update and not when we change any other fields.
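
A minimal sketch of that idea follows, assuming a hypothetical `last_status_at` field that only a status report writes. The class, field, and function names and the 60-second threshold are illustrative assumptions and are not taken from the actual fix.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Assumed liveness threshold, matching the one-minute window in the report.
SERVICE_DOWN_TIME = timedelta(seconds=60)


@dataclass
class ServiceRecord:
    """Hypothetical services row with a dedicated status timestamp."""
    host: str
    disabled: bool
    updated_at: datetime       # bumped by any write to the row
    last_status_at: datetime   # hypothetical: bumped only by status reports


def heartbeat(svc: ServiceRecord, now: datetime) -> None:
    # A status report from the service updates both timestamps.
    svc.updated_at = now
    svc.last_status_at = now


def set_disabled(svc: ServiceRecord, disabled: bool, now: datetime) -> None:
    # An admin change updates the row but leaves last_status_at alone.
    svc.disabled = disabled
    svc.updated_at = now


def is_up(svc: ServiceRecord, now: datetime) -> bool:
    # Liveness depends only on the last status report.
    return now - svc.last_status_at <= SERVICE_DOWN_TIME


def check_toggle_does_not_revive():
    t0 = datetime(2015, 2, 11, 12, 0, 0)  # arbitrary start time
    svc = ServiceRecord("compute-0", False, updated_at=t0, last_status_at=t0)

    t1 = t0 + timedelta(seconds=90)        # service stopped reporting
    up_before_toggle = is_up(svc, t1)

    set_disabled(svc, True, t1)            # disable, then enable
    set_disabled(svc, False, t1)
    up_after_toggle = is_up(svc, t1)       # still down: no status report arrived

    heartbeat(svc, t1)                     # the service comes back and reports
    up_after_heartbeat = is_up(svc, t1)
    return up_before_toggle, up_after_toggle, up_after_heartbeat
```

With the two timestamps separated, toggling the disabled flag no longer affects the liveness check; only a real status report does.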

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1420848/+subscriptions