group.of.nepali.translators team mailing list archive
-
group.of.nepali.translators team
-
Mailing list archive
-
Message #11530
[Bug 1641875] Re: Scheduled events (e.g. LDAP connection retries) are affected by clock adjustments
This bug was fixed in the package sssd - 1.13.4-3ubuntu0.1
---------------
sssd (1.13.4-3ubuntu0.1) yakkety; urgency=medium
* d/p/restart_providers_on_timeshift.patch: Implement watchdog and
use SIGUSR2 after watchdog detects time shift to execute pending
scheduled tasks that could be stuck (LP: #1641875)
-- Victor Tapia <victor.tapia@xxxxxxxxxxxxx> Thu, 23 Feb 2017 11:20:43
+0100
** Changed in: sssd (Ubuntu Yakkety)
Status: Fix Committed => Fix Released
--
You received this bug notification because you are a member of नेपाली
भाषा समायोजकहरुको समूह, which is subscribed to Xenial.
Matching subscriptions: Ubuntu 16.04 Bugs
https://bugs.launchpad.net/bugs/1641875
Title:
Scheduled events (e.g. LDAP connection retries) are affected by clock
adjustments
Status in sssd package in Ubuntu:
Fix Released
Status in sssd source package in Trusty:
Fix Committed
Status in sssd source package in Xenial:
Fix Released
Status in sssd source package in Yakkety:
Fix Released
Status in sssd source package in Zesty:
Fix Released
Bug description:
[Impact]
* SSSD is affected by clock shifts because it's built on libtevent, that doesn't use monotonic clocks.
* After an event has been scheduled, if the time drifts to the past SSSD won't recover and the event will have to wait the shifted time to be executed.
[Test Case]
* Modify the /etc/hosts file to force a failed resolution and restart sssd.
* Within the first ~1 minute retry window, change the date (date --set) to a previous time (e.g. 2h).
* "Tail" the log file. The event will be executed at the original schedule, past those 2h.
[Regression Potential]
* None
[Other Info]
* Version 1.15 implements a watchdog that detects time shifts and resets itselft but doesn't reschedule the scheduled events.
* There's a network monitor based on netlink that detects interface changes and restarts the providers, rescheduling the scheduled events.
* Such restart can be triggered with SIGUSR2. Sending that signal after the watchdog restarts fixes this issue.
* Upstream bug: https://fedorahosted.org/sssd/ticket/3285
[Original Description]
When SSSD fails to connect to a provider (LDAP, for instance) it
creates a timed event with tevent_add_timer() in order to retry in ~1
min. Tevent relies on CLOCK_REALTIME, using absolute epoch time, so
when the time changes (e.g. NTP sync) the scheduled event is affected.
Reproducer:
1. Modify the /etc/hosts file to force a failed resolution and restart
sssd
(Tue Oct 25 09:37:14 2016) [sssd[be[openstacklocal]]] [resolv_gethostbyname_dns_query] (0x0100): Trying to resolve A record of 'ldap' in DNS
(Tue Oct 25 09:37:14 2016) [sssd[be[openstacklocal]]] [resolv_gethostbyname_done] (0x0040): querying hosts database failed [5]: Input/output error
(Tue Oct 25 09:37:14 2016) [sssd[be[openstacklocal]]] [fo_resolve_service_done] (0x0020): Failed to resolve server 'ldap': Could not contact DNS servers
...
(Tue Oct 25 09:37:14 2016) [sssd[be[openstacklocal]]] [check_online_callback] (0x0100): Backend returned: (1, 0, <NULL>) [Provider is Offline]
2. Within the ~1 minute window, change the date (date --set) to a
previous time (2 hour in my example). Note: if /etc/resolv.conf or a
network interface is modified, SSSD providers will restart, but the
scheduled retry will be kept.
(Tue Oct 25 07:41:46 2016) [sssd[be[openstacklocal]]] [recreate_ares_channel] (0x0100): Initializing new c-ares channel
(Tue Oct 25 07:41:46 2016) [sssd[be[openstacklocal]]] [recreate_ares_channel] (0x0100): Destroying the old c-ares channel
(Tue Oct 25 07:41:46 2016) [sssd[be[openstacklocal]]] [set_server_common_status] (0x0100): Marking server 'ldap' as 'name not resolved'
(Tue Oct 25 07:41:46 2016) [sssd[be[openstacklocal]]] [fo_set_port_status] (0x0100): Marking port 389 of server 'ldap' as 'neutral'
(Tue Oct 25 07:41:46 2016) [sssd[be[openstacklocal]]] [fo_resolve_service_send] (0x0100): Trying to resolve service 'LDAP'
...
(Tue Oct 25 07:41:46 2016) [sssd[be[openstacklocal]]] [fo_resolve_service_done] (0x0020): Failed to resolve server 'ldap': Could not contact DNS servers
(Tue Oct 25 07:41:46 2016) [sssd[be[openstacklocal]]] [set_server_common_status] (0x0100): Marking server 'ldap' as 'not working'
...
(Tue Oct 25 07:41:46 2016) [sssd[be[openstacklocal]]] [check_online_callback] (0x0100): Backend returned: (1, 0, <NULL>) [Provider is Offline]
3. Once it gets to the programmed date (2 hours later), it retries:
(Tue Oct 25 09:38:25 2016) [sssd[be[openstacklocal]]] [set_server_common_status] (0x0100): Marking server 'ldap' as 'name not resolved'
(Tue Oct 25 09:38:25 2016) [sssd[be[openstacklocal]]] [fo_set_port_status] (0x0100): Marking port 389 of server 'ldap' as 'neutral'
(Tue Oct 25 09:38:25 2016) [sssd[be[openstacklocal]]] [fo_resolve_service_send] (0x0100): Trying to resolve service 'LDAP'
...
(Tue Oct 25 09:38:25 2016) [sssd[be[openstacklocal]]] [fo_resolve_service_done] (0x0020): Failed to resolve server 'ldap': Could not contact DNS servers
(Tue Oct 25 09:38:25 2016) [sssd[be[openstacklocal]]] [set_server_common_status] (0x0100): Marking server 'ldap' as 'not working'
...
(Tue Oct 25 09:38:25 2016) [sssd[be[openstacklocal]]] [check_online_callback] (0x0100): Backend returned: (1, 0, <NULL>) [Provider is Offline]
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/sssd/+bug/1641875/+subscriptions