yahoo-eng-team team mailing list archive
Message #34490
[Bug 1420848] Re: "down" nova-compute service spuriously marked as "up" when disabled/enabled
** Changed in: nova
Status: Fix Committed => Fix Released
** Changed in: nova
Milestone: None => liberty-1
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1420848
Title:
"down" nova-compute service spuriously marked as "up" when
disabled/enabled
Status in OpenStack Compute (Nova):
Fix Released
Bug description:
I think our usage of the "updated_at" field to determine whether a
service is "up" or not is buggy. Consider this scenario:
1) nova-compute is happily running and is up/enabled on compute-0
2) something causes nova-compute to stop (process crash, hardware fault, power failure, network isolation, etc.)
3) a minute later, the nova-compute service is reported as "down"
4) I run "nova service-disable compute-0 nova-compute", then "nova service-enable compute-0 nova-compute"
5) nova-compute is now reported as "up" for the next minute, and the scheduler might try to assign work to it. Since the service is not actually available, these requests could be delayed by the RPC timeout period.
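The scenario above can be reproduced with a small self-contained model. This is only a sketch, not nova's code: the Service class, the helper names, and the 60-second threshold are assumptions chosen to mirror the steps in the description, and the timestamps are arbitrary.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Assumed threshold, matching the "a minute later" in step 3.
SERVICE_DOWN_TIME = timedelta(seconds=60)


@dataclass
class Service:
    """Toy stand-in for a service record (hypothetical, not nova's model)."""
    host: str
    disabled: bool
    updated_at: datetime  # bumped on ANY change to the record


def heartbeat(svc: Service, now: datetime) -> None:
    # Periodic status report from a running service.
    svc.updated_at = now


def set_disabled(svc: Service, disabled: bool, now: datetime) -> None:
    # Administrative change; as a side effect it also bumps updated_at.
    svc.disabled = disabled
    svc.updated_at = now


def is_up(svc: Service, now: datetime) -> bool:
    # Liveness is inferred purely from how recent updated_at is.
    return now - svc.updated_at <= SERVICE_DOWN_TIME


def reproduce() -> tuple:
    t0 = datetime(2015, 1, 1, 12, 0, 0)           # arbitrary start time
    svc = Service("compute-0", False, t0)
    heartbeat(svc, t0)                            # step 1: running normally
    t_down = t0 + timedelta(seconds=90)           # step 2: no more heartbeats
    up_before = is_up(svc, t_down)                # step 3: reported down
    set_disabled(svc, True, t_down)               # step 4: disable ...
    set_disabled(svc, False, t_down + timedelta(seconds=1))  # ... then enable
    up_after = is_up(svc, t_down + timedelta(seconds=2))     # step 5
    return up_before, up_after


print(reproduce())
```

In this model the service is reported down before the disable/enable cycle and up immediately after it, although no heartbeat arrived in between.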
I wonder if it would make sense to have a separate "last status
timestamp" database field that would only get updated when we get a
service status update and not when we change any other fields.
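A rough sketch of that idea follows. It is not a statement of how nova implements the fix; the field name last_seen_up, the helper names, and the 60-second threshold are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Assumed threshold, matching the one-minute window in the description.
SERVICE_DOWN_TIME = timedelta(seconds=60)


@dataclass
class Service:
    """Toy service record with a dedicated liveness timestamp (hypothetical)."""
    host: str
    disabled: bool
    updated_at: datetime    # still bumped on any change to the record
    last_seen_up: datetime  # hypothetical field: bumped only by status reports


def report_state(svc: Service, now: datetime) -> None:
    # A status update from the running service refreshes both timestamps.
    svc.last_seen_up = now
    svc.updated_at = now


def set_disabled(svc: Service, disabled: bool, now: datetime) -> None:
    # Administrative change: updated_at moves, last_seen_up does not.
    svc.disabled = disabled
    svc.updated_at = now


def is_up(svc: Service, now: datetime) -> bool:
    # Liveness depends only on the last real status report.
    return now - svc.last_seen_up <= SERVICE_DOWN_TIME


def check_toggle_keeps_service_down() -> bool:
    t0 = datetime(2015, 1, 1, 12, 0, 0)   # arbitrary start time
    svc = Service("compute-0", False, t0, t0)
    report_state(svc, t0)                 # last real status update
    t_down = t0 + timedelta(seconds=90)   # service stopped reporting
    set_disabled(svc, True, t_down)
    set_disabled(svc, False, t_down + timedelta(seconds=1))
    return is_up(svc, t_down + timedelta(seconds=2))


print(check_toggle_keeps_service_down())
```

With the liveness check decoupled from updated_at, toggling the disabled flag no longer makes a stopped service look alive in this model.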
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1420848/+subscriptions