touch-packages team mailing list archive

[Bug 1447756] Re: segfault in log.c code causes phone reboot loops

 

Hi Ondrej,

Regarding #15, I'm not sure this is correct. As you say, when the job
process terminates, job_process_terminated() gets called. This calls
log_handle_unflushed() and that function calls log_read_watch(), which
ultimately calls write(2). However, even if the write is successful
before 'initctl notify-disk-writeable' gets called, if you look at
log_handle_unflushed()...

    log_handle_unflushed (void *parent, Log *log)
    {
            NihListEntry  *elem;

            nih_assert (log);
            nih_assert (log->detached == 0);

            log_read_watch (log);

            if (! log->unflushed->len)
                    return 1;

So, if the write is successful and log->unflushed->len becomes zero, the
function returns 1 (meaning "log does not need to be added to the
unflushed list") and crucially the log is not added to the unflushed
list.
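
To make that control flow concrete, here is a minimal sketch of the early return. It is not the Upstart code: SketchLog, sketch_flush() and sketch_handle_unflushed() are hypothetical stand-ins, the disk_writeable flag replaces the real write(2) path, and the 0 returned on the "added to the list" path is my assumption for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for upstart's Log object. */
typedef struct sketch_log {
    char   unflushed[64];   /* data not yet written to disk        */
    size_t unflushed_len;   /* number of valid bytes in unflushed  */
    int    detached;        /* non-zero once cut loose from a job  */
    int    queued;          /* non-zero once on the unflushed list */
} SketchLog;

/* Models the flush attempt: if the disk is writeable, the buffered
 * data is considered written and the buffer becomes empty. */
static void sketch_flush(SketchLog *log, int disk_writeable)
{
    if (disk_writeable)
        log->unflushed_len = 0;
}

/* Mirrors the shape of the excerpt above: flush first, then return
 * early if nothing is left to flush. */
static int sketch_handle_unflushed(SketchLog *log, int disk_writeable)
{
    assert(log);
    assert(log->detached == 0);

    sketch_flush(log, disk_writeable);

    if (!log->unflushed_len)
        return 1;           /* nothing left: log is NOT queued */

    /* Assumption: data remains, so the log is detached and queued. */
    log->detached = 1;
    log->queued = 1;
    return 0;               /* assumed value for the "queued" path */
}
```

With a writeable disk the sketch returns 1 and leaves queued at 0, so in this model an entry only ever reaches the list while it still holds unflushed data.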

Regarding #16, there shouldn't be a problem in that scenario since when
the log gets added to the unflushed list, it is totally detached from
its parent job. Hence, the job can be destroyed but the log lives on as
an element of the unflushed list. If that job gets recreated, it will
get a new set of log objects associated with it.
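
The lifetime argument can be sketched roughly as follows. This is a plain malloc/free model, not the NIH allocator that Upstart really uses, and every name in it (DemoLog, DemoJob, demo_unflushed, demo_job_new, demo_detach_log, demo_job_free) is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for a log and the job that owns it. */
typedef struct demo_log { int id; struct demo_log *next; } DemoLog;
typedef struct demo_job { DemoLog *log; } DemoJob;

/* Hypothetical global list of logs still holding unflushed data. */
static DemoLog *demo_unflushed = NULL;

/* Create a job together with a fresh log object. */
static DemoJob *demo_job_new(int log_id)
{
    DemoJob *job = malloc(sizeof *job);
    DemoLog *log = malloc(sizeof *log);

    if (!job || !log) {
        free(job);
        free(log);
        return NULL;
    }
    log->id = log_id;
    log->next = NULL;
    job->log = log;
    return job;
}

/* Detach the log from its job and push it onto the unflushed list.
 * After this call the job no longer owns the log. */
static void demo_detach_log(DemoJob *job)
{
    job->log->next = demo_unflushed;
    demo_unflushed = job->log;
    job->log = NULL;
}

/* Destroy a job; a detached log is untouched because job->log is
 * NULL by then and free(NULL) is a no-op. */
static void demo_job_free(DemoJob *job)
{
    free(job->log);
    free(job);
}
```

In this model, destroying the old job after demo_detach_log() leaves the log reachable only through demo_unflushed, and a job created afterwards gets its own, distinct log.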

Did you manage to get the full log_clear_unflushed() debug output in the
end?

-- 
https://bugs.launchpad.net/bugs/1447756

Title:
  segfault in log.c code causes phone reboot loops

Status in the base for Ubuntu mobile products:
  Fix Committed
Status in Upstart:
  New
Status in upstart package in Ubuntu:
  Confirmed

Bug description:
  We recently started getting reports from phone users that their
  devices go into a reboot loop after changing the language or getting
  an OTA upgrade (both of which end with a reboot of the phone).

  After a bit of research we collected the log at
  http://pastebin.ubuntu.com/10872934/

  This shows a segfault of Upstart's init binary in the log.c code:

  [    6.999083]init: log.c:819: Assertion failed in log_clear_unflushed: log->unflushed->len
  [    7.000279]init: Caught abort, core dumped
  [    7.467176]Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000600
