touch-packages team mailing list archive
-
touch-packages team
-
Mailing list archive
-
Message #52275
[Bug 1188642] Re: Init aborts with the message: init: log.c:813: Assertion failed in log_clear_unflushed: ! log->io
Encounter same issue in 14.04
--
You received this bug notification because you are a member of Ubuntu
Touch seeded packages, which is subscribed to upstart in Ubuntu.
https://bugs.launchpad.net/bugs/1188642
Title:
Init aborts with the message: init: log.c:813: Assertion failed in
log_clear_unflushed: ! log->io
Status in Upstart:
Fix Released
Status in upstart package in Ubuntu:
Fix Released
Bug description:
init (in Ubuntu 13.04) accidentally aborts with the following messages
in the kernel log:
[ 4.981969] init: log.c:813: Assertion failed in log_clear_unflushed: ! log->io
[ 4.991031] init: Caught abort, core dumped
[ 4.995571] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000600
I've figured out that the probable cause is a race between the line 699 of the file init/log.c of the function log_read_watch() that sets the field
log->remote_closed to 1 (the caller clears log->io) and the routine log_clear_unflushed() that causes the abort when log->remote_closed == 1 but log->io != NULL: log_clear_unflushed() may preempt log_read_watch() when
log->remote_closed is set to 1 but before log->io is cleared for
the same log.
I applied the following patch to prove this assertion:
@@ -264,6 +266,8 @@
*/
log->io->error_handler = NULL;
+ nih_fatal("log %p: clearing log->io\n", log);
+
nih_free (log->io);
log->io = NULL;
}
@@ -384,6 +388,8 @@
nih_free (err);
+ nih_fatal("log %p: clearing log->io\n", log);
+
/* Ensure the NihIo is closed */
nih_free (log->io);
log->io = NULL;
@@ -696,7 +702,11 @@
*
*/
if (saved && saved != EAGAIN && saved != EWOULDBLOCK)
+ {
+ nih_fatal("log %p: setting log->remote_closed\n", log);
log->remote_closed = 1;
+ }
close (log->fd);
log->fd = -1;
@@ -806,6 +816,8 @@
nih_assert (log->open_errno);
if (log->remote_closed) {
+ nih_fatal("log %p: checking log->io\n", log);
+
/* Parent job has ended and unflushed data
* exists.
*/
I got the following kernel log before the abort:
[ 4.635146] init: log 0xb882a660: clearing log->io
[ 4.640487] init: log 0xb88035e0: setting log->remote_closed
[ 4.654816] init: log 0xb882ce88: clearing log->io
[ 4.655263] init: log 0xb882a1d0: setting log->remote_closed
[ 4.706758] init: log 0xb8831370: setting log->remote_closed
[ 6.714543] init: log 0xb882a600: clearing log->io
[ 6.733596] init: log 0xb882c6a0: clearing log->io
[ 6.746124] init: log 0xb882cbd8: setting log->remote_closed
[ 6.765075] init: log 0xb882b1d0: clearing log->io
[ 6.787772] init: log 0xb8842b50: clearing log->io
[ 6.801163] init: log 0xb882cb78: clearing log->io
[ 6.810717] init: log 0xb882df58: clearing log->io
[ 6.864869] init: log 0xb883e418: clearing log->io
[ 6.871033] init: log 0xb882d2a8: clearing log->io
[ 6.899113] init: log 0xb882d268: clearing log->io
[ 6.927477] init: log 0xb884eb28: clearing log->io
[ 6.942432] init: log 0xb88519a8: clearing log->io
[ 6.997231] init: log 0xb8850c40: clearing log->io
[ 7.031685] init: log 0xb8850650: clearing log->io
[ 7.042562] init: log 0xb8856d10: clearing log->io
[ 7.042811] init: log 0xb8857cf8: clearing log->io
[ 7.063461] init: log 0xb885a850: setting log->remote_closed
[ 7.063996] init: log 0xb883e478: setting log->remote_closed
[ 7.095505] init: log 0xb8854960: setting log->remote_closed
[ 7.135581] init: log 0xb883e478: checking log->io
[ 7.135756] init: log.c:825: Assertion failed in log_clear_unflushed: ! log->io
[ 7.140537] init: Caught abort, core dumped
[ 7.143211] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000600
So the assertion has been proved for the log at 0xb883e478 which has
the field remote_closed updated but the field io dirty.
To manage notifications about this bug go to:
https://bugs.launchpad.net/upstart/+bug/1188642/+subscriptions