touch-packages team mailing list archive
-
touch-packages team
-
Mailing list archive
-
Message #129772
[Bug 1535854] [NEW] watchdog daemon going in to failed state on reboot
Public bug reported:
I was testing the watchdog daemon code on the development version of
16.04 today and found the associated systemd service files for this has
some bugs.
The first of these was a typo in /lib/systemd/system/watchdog.service
where there was a missing ['] character. This lead to an error message
"Unbalanced quoting, ignoring" which was easy to fix. The update is now
in the source-forge repository for the project as commit
http://sourceforge.net/p/watchdog/code/ci/38e6430f80907a84741c760ef48df69a679b294c/
However, I have not found the reason for the second problem where the
daemon has some fault at reboot time and goes in to "failed state" and
then it will not restart with the machine booting. The typical related
syslog entries are:
Jan 19 16:46:08 ubuntu watchdog[2066]: stopping daemon (5.14)
Jan 19 16:46:08 ubuntu systemd[1]: Stopping watchdog daemon...
Jan 19 16:46:09 ubuntu systemd[1]: watchdog.service: Control process exited, code=exited status=1
Jan 19 16:46:09 ubuntu systemd[1]: Stopped watchdog daemon.
Jan 19 16:46:09 ubuntu systemd[1]: watchdog.service: Unit entered failed state.
Jan 19 16:46:09 ubuntu systemd[1]: watchdog.service: Triggering OnFailure= dependencies.
Jan 19 16:46:09 ubuntu systemd[1]: watchdog.service: Failed to enqueue OnFailure= job: Resource deadlock avoided
Jan 19 16:46:09 ubuntu systemd[1]: watchdog.service: Failed with result 'exit-code'.
I am guessing this is related to the shut-down approach for the watchdog
daemon where normally the wd_keepalive daemon is started afterwards (to
prevent a reboot if the hardware module is configured for "no way out"
so the timer cannot be stopped).
$ lsb_release -rd
Description: Ubuntu Xenial Xerus (development branch)
Release: 16.04
$ apt-cache policy watchdog
watchdog:
Installed: 5.14-3
Candidate: 5.14-3
Version table:
*** 5.14-3 500
500 http://us.archive.ubuntu.com/ubuntu xenial/universe i386 Packages
100 /var/lib/dpkg/status
I have contacted the watchdog project maintainer with a view to working
out a solution to this. This bug report is more of a marker to let folks
know that there is an issue here if you plan on having high-availability
servers based on Ubuntu 16.04 (well, any systemd based system really..)
where watchdog-based fault recovery is an expectation.
** Affects: watchdog (Ubuntu)
Importance: Undecided
Status: New
** Package changed: rsyslog (Ubuntu) => watchdog (Ubuntu)
--
You received this bug notification because you are a member of Ubuntu
Touch seeded packages, which is subscribed to rsyslog in Ubuntu.
https://bugs.launchpad.net/bugs/1535854
Title:
watchdog daemon going in to failed state on reboot
Status in watchdog package in Ubuntu:
New
Bug description:
I was testing the watchdog daemon code on the development version of
16.04 today and found the associated systemd service files for this
has some bugs.
The first of these was a typo in /lib/systemd/system/watchdog.service
where there was a missing ['] character. This lead to an error message
"Unbalanced quoting, ignoring" which was easy to fix. The update is
now in the source-forge repository for the project as commit
http://sourceforge.net/p/watchdog/code/ci/38e6430f80907a84741c760ef48df69a679b294c/
However, I have not found the reason for the second problem where the
daemon has some fault at reboot time and goes in to "failed state" and
then it will not restart with the machine booting. The typical related
syslog entries are:
Jan 19 16:46:08 ubuntu watchdog[2066]: stopping daemon (5.14)
Jan 19 16:46:08 ubuntu systemd[1]: Stopping watchdog daemon...
Jan 19 16:46:09 ubuntu systemd[1]: watchdog.service: Control process exited, code=exited status=1
Jan 19 16:46:09 ubuntu systemd[1]: Stopped watchdog daemon.
Jan 19 16:46:09 ubuntu systemd[1]: watchdog.service: Unit entered failed state.
Jan 19 16:46:09 ubuntu systemd[1]: watchdog.service: Triggering OnFailure= dependencies.
Jan 19 16:46:09 ubuntu systemd[1]: watchdog.service: Failed to enqueue OnFailure= job: Resource deadlock avoided
Jan 19 16:46:09 ubuntu systemd[1]: watchdog.service: Failed with result 'exit-code'.
I am guessing this is related to the shut-down approach for the
watchdog daemon where normally the wd_keepalive daemon is started
afterwards (to prevent a reboot if the hardware module is configured
for "no way out" so the timer cannot be stopped).
$ lsb_release -rd
Description: Ubuntu Xenial Xerus (development branch)
Release: 16.04
$ apt-cache policy watchdog
watchdog:
Installed: 5.14-3
Candidate: 5.14-3
Version table:
*** 5.14-3 500
500 http://us.archive.ubuntu.com/ubuntu xenial/universe i386 Packages
100 /var/lib/dpkg/status
I have contacted the watchdog project maintainer with a view to
working out a solution to this. This bug report is more of a marker to
let folks know that there is an issue here if you plan on having high-
availability servers based on Ubuntu 16.04 (well, any systemd based
system really..) where watchdog-based fault recovery is an
expectation.
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/watchdog/+bug/1535854/+subscriptions