yahoo-eng-team team mailing list archive
Message #70593
[Bug 1510234] Re: Heartbeats stop when time is changed
** Also affects: nova/pike
Importance: Undecided
Status: New
** Changed in: nova
Assignee: Stephen Finucane (stephenfinucane) => Roman Podoliaka (rpodolyaka)
** Changed in: nova/pike
Status: New => In Progress
** Changed in: nova/pike
Assignee: (unassigned) => John Smith (wang-zengzhi)
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1510234
Title:
Heartbeats stop when time is changed
Status in masakari:
Fix Released
Status in OpenStack Compute (nova):
Fix Released
Status in OpenStack Compute (nova) pike series:
In Progress
Status in oslo.service:
Fix Released
Bug description:
Heartbeats stop working when the system time is changed. If a
monotonic clock were used, they would continue to work regardless of
changes to the system time.
Steps to reproduce:
1. List the nova services ('nova-manage service list'). Note that the
'State' for each service is a happy face ':-)'.
2. Move the time ahead (for example 2 hours in the future), and then
list the nova services again. Note that heartbeats continue to work
and use the future time (see 'Updated_At').
3. Revert to the actual time, and list the nova services again.
Note that all heartbeats stop, and each service has a 'State' of 'XXX'.
4. The heartbeats will start again in 2 hours when the actual time
catches up to the future time, or if you restart the services.
5. You'll see a log message like the following when the heartbeats
stop:
2015-10-26 17:14:10.538 DEBUG nova.servicegroup.drivers.db [req-c41a2ad7-e5a5-4914-bdc8-6c1ca8b224c6 None None] Seems service is down. Last heartbeat was 2015-10-26 17:20:20. Elapsed time is -369.461679 from (pid=13994) is_up /opt/stack/nova/nova/servicegroup/drivers/db.py:80
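The negative elapsed time in the log message above can be reproduced with a small sketch. This is not nova's code: the function name is_up_sketch, the 60-second threshold, and the microsecond part of the "now" timestamp (chosen so the arithmetic matches the logged value) are assumptions made for illustration.

```python
from datetime import datetime

# Assumed threshold in seconds; the report does not say what value was configured.
SERVICE_DOWN_TIME = 60


def is_up_sketch(last_heartbeat, now, down_time=SERVICE_DOWN_TIME):
    """Hypothetical liveness check based on wall-clock timestamps."""
    elapsed = (now - last_heartbeat).total_seconds()
    # A heartbeat stamped in the future yields a negative elapsed time.
    # Comparing the absolute value treats it as out of range as well.
    return abs(elapsed) <= down_time, elapsed


# Timestamps taken from the log message; the microseconds of `now` are assumed.
now = datetime(2015, 10, 26, 17, 14, 10, 538321)
last_heartbeat = datetime(2015, 10, 26, 17, 20, 20)

up, elapsed = is_up_sketch(last_heartbeat, now)
print(up, elapsed)
```

Under these assumptions the service is reported as down until the wall clock catches up with the last recorded heartbeat, which is consistent with step 4.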
Here's example output demonstrating the issue:
http://paste.openstack.org/show/477404/
See bug #1450438 for more context:
https://bugs.launchpad.net/oslo.service/+bug/1450438
Long story short: the looping call in oslo.service uses wall-clock
(system) time rather than a monotonic clock to time its sleeps.
https://github.com/openstack/oslo.service/blob/3d79348dae4d36bcaf4e525153abf74ad4bd182a/oslo_service/loopingcall.py#L122
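As a rough illustration of why the clock source matters, here is a minimal sketch of a fixed-interval loop. It does not reproduce the oslo.service code; the names next_delay and run_fixed_interval are hypothetical, and the numbers are made up. The only library facts relied on are that time.monotonic() cannot go backwards and is not affected by system clock updates.

```python
import time


def next_delay(start, end, interval):
    """Seconds to sleep so that calls start `interval` seconds apart."""
    return max(interval - (end - start), 0.0)


def run_fixed_interval(func, interval, iterations,
                       clock=time.monotonic, sleep=time.sleep):
    """Hypothetical loop: call func roughly every `interval` seconds.

    `clock` defaults to time.monotonic, which system time changes
    do not affect.
    """
    delays = []
    for _ in range(iterations):
        start = clock()
        func()
        delay = next_delay(start, clock(), interval)
        delays.append(delay)
        sleep(delay)
    return delays


# Simulated wall-clock readings where the time is set back two hours
# between the two readings: the computed sleep grows by the size of the step.
wall_clock_delay = next_delay(start=7200.0, end=0.5, interval=10.0)

# Monotonic readings only move forward, so the sleep stays within the interval.
monotonic_delay = next_delay(start=100.0, end=100.5, interval=10.0)

print(wall_clock_delay, monotonic_delay)
```

Passing the clock in as a parameter keeps the sketch testable with simulated readings; a real fix would also need whatever performs the sleep to be independent of wall-clock time.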
Oslo Service: version 0.11
Nova: master (commit 2c3f9c339cae24576fefb66a91995d6612bb4ab2)