yahoo-eng-team team mailing list archive
Message #93441
[Bug 2052718] Re: Nova Compute Service status goes up and down abnormally
I don't believe this is in the scope of nova to fix.
The requirement to have consistent time synchronisation is well known, and
it strongly feels like a problem that should be addressed in an
installation tool, not in code.
We mention in the docs that the controllers should be running shared services like NTP:
https://docs.openstack.org/nova/latest/install/overview.html#controller
If you have not ensured your clocks are in sync as part of the installation process, via NTP, PTP or another method, then I would not consider OpenStack to be correctly installed.
** Changed in: nova
Status: New => Opinion
** Changed in: nova
Importance: Undecided => Wishlist
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/2052718
Title:
Compute service status still up with nagative elapsed time
Status in OpenStack Compute (nova):
Opinion
Bug description:
Hi community,
When you run:
$ openstack compute service list
the status shows "up", but the underlying logic is wrong because the
elapsed time is a negative number. This is caused by abs(elapsed)
turning the negative value into a positive one.
The abs(elapsed) call is in is_up() ->
https://github.com/openstack/nova/blob/stable/2023.2/nova/servicegroup/drivers/db.py
    ...
    def is_up(self, service_ref):
        ...
        # Timestamps in DB are UTC.
        elapsed = timeutils.delta_seconds(last_heartbeat, timeutils.utcnow())
        is_up = abs(elapsed) <= self.service_down_time
        if not is_up:
            LOG.debug('Seems service %(binary)s on host %(host)s is down. '
                      'Last heartbeat was %(lhb)s. Elapsed time is %(el)s',
                      {'binary': service_ref.get('binary'),
                       'host': service_ref.get('host'),
                       'lhb': str(last_heartbeat), 'el': str(elapsed)})
        return is_up
    ...
service_down_time (threshold): 60s
https://github.com/openstack/nova/blob/stable/2023.2/nova/conf/service.py#L40
=========================== Bad result ===========================
Example (1) bug:
last_heartbeat: 10:00:00 AM
now: 9:59:30 AM
elapsed: -30(s)
abs(-30s) = 30s <= 60s
===> result: up
Example (2) bug:
last_heartbeat: 10:01:00 AM
now: 9:59:58 AM
elapsed: -62(s)
abs(-62s) = 62s > 60s
===> result: down
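The two cases above can be reproduced outside nova with a small standalone sketch. It is not nova code: delta_seconds() here is a stdlib stand-in for oslo_utils.timeutils.delta_seconds, is_up_current() is a hypothetical name, and the date is arbitrary. Only the abs() comparison and the 60s threshold are taken from the report.

```python
from datetime import datetime, timedelta

SERVICE_DOWN_TIME = 60  # seconds; the threshold cited in the report


def delta_seconds(before, after):
    """Stand-in for oslo_utils.timeutils.delta_seconds: after - before, in seconds."""
    return (after - before).total_seconds()


def is_up_current(last_heartbeat, now, service_down_time=SERVICE_DOWN_TIME):
    """Same comparison as the quoted is_up(): the sign of elapsed is discarded."""
    elapsed = delta_seconds(last_heartbeat, now)
    return abs(elapsed) <= service_down_time


last_heartbeat = datetime(2024, 2, 8, 10, 0, 0)  # arbitrary illustrative date
for skew in (-30, -62):
    # "now" is behind the heartbeat by |skew| seconds, as in the two examples
    now = last_heartbeat + timedelta(seconds=skew)
    print(skew, is_up_current(last_heartbeat, now))
# prints "-30 True" then "-62 False"
```

So a heartbeat up to 60s in the future is reported as up, and the service flips to down only once the skew exceeds the threshold.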
=========================== Expected result ===========================
Example (1) good expectations:
last_heartbeat: 10:00:00 AM
now: 9:59:30 AM
elapsed: -30(s) < 0
===> result: logging error and down
Example (2) good expectations:
last_heartbeat: 10:01:00 AM
now: 9:59:58 AM
elapsed: -62(s) < 0
===> result: logging error and down
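A minimal sketch of the behaviour requested here, assuming a negative elapsed time should be logged as an error and treated as down. This is only an illustration of the reporter's expectation: is_up_strict() is a hypothetical name, it is not part of nova, and the reply on this bug considers the change out of scope.

```python
import logging
from datetime import datetime, timedelta

LOG = logging.getLogger(__name__)
SERVICE_DOWN_TIME = 60  # seconds; the threshold cited in the report


def is_up_strict(last_heartbeat, now, service_down_time=SERVICE_DOWN_TIME):
    """Hypothetical variant: a heartbeat from the future counts as down."""
    elapsed = (now - last_heartbeat).total_seconds()
    if elapsed < 0:
        # A negative value means the clocks disagree; report it instead of
        # hiding it behind abs().
        LOG.error('Last heartbeat %s is ahead of current time %s '
                  '(elapsed %s s); check time synchronisation.',
                  last_heartbeat, now, elapsed)
        return False
    return elapsed <= service_down_time


last_heartbeat = datetime(2024, 2, 8, 10, 0, 0)  # arbitrary illustrative date
for skew in (-30, -62, 30):
    now = last_heartbeat + timedelta(seconds=skew)
    print(skew, is_up_strict(last_heartbeat, now))
# prints "-30 False", "-62 False", "30 True"
```

With this variant both examples report down, and the error log points the operator at the clock skew rather than at the service itself.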
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/2052718/+subscriptions