touch-packages team mailing list archive

[Bug 1473562] Re: Too many crash files kill the device

 

The existing upstart job for whoopsie-upload-all is designed to ensure
that only one whoopsie-upload-all process takes care of the upload
processing on boot.  However, it's true that if there are multiple crash
files in the directory, a separate whoopsie-upload-all process will be
/spawned/ for each crash file, and they will all be spawned in parallel.
So while the current behavior is responsible in its use of CPU time,
it's less than ideal in its memory usage; each instance of the
whoopsie-upload-all Python script uses about 50MB (on amd64), and while
some of that is shared libraries, the stack usage will add up.

I suggest that a simple fix for this would be to change the apport-noui
upstart job to wrap the calls to whoopsie-upload-all with
/lib/udev/watershed.  This would limit us to one running
whoopsie-upload-all process at a time; there would be multiple watershed
processes, but those processes take up much less memory (roughly 300 kB
each).
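
To illustrate the serialization idea, here is a minimal sketch that uses
flock(1) from util-linux as a stand-in; it is not the watershed-based
change proposed above, and it is not the actual apport-noui job. The
names run_serialized and LOCKFILE, and the lock file path, are
hypothetical. Unlike watershed, flock simply queues every caller behind
the lock, so each waiting invocation still runs its command in turn.

```shell
#!/bin/sh
# Hypothetical sketch: run a command while holding an exclusive lock so
# that at most one instance of it runs at a time.
# LOCKFILE and run_serialized are illustrative names, not part of apport.
LOCKFILE="${LOCKFILE:-/var/lock/whoopsie-upload-all.lock}"

run_serialized() {
    # flock blocks until it holds an exclusive lock on $LOCKFILE, then
    # runs the given command; the lock is released when the command exits.
    flock "$LOCKFILE" "$@"
}

# Possible use inside a job script (assumption, not the real job):
#   run_serialized whoopsie-upload-all
```

With a wrapper like this, the waiting processes are small flock
instances rather than full Python interpreters, which is the same memory
trade-off described above for watershed.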

Brian, can you look at adding watershed to this job to see if that
addresses the problems for the phone?

** Changed in: apport (Ubuntu)
   Importance: Undecided => High

** Changed in: apport (Ubuntu)
       Status: New => Triaged

** Changed in: apport (Ubuntu)
     Assignee: Steve Langasek (vorlon) => Brian Murray (brian-murray)

-- 
You received this bug notification because you are a member of Ubuntu
Touch seeded packages, which is subscribed to apport in Ubuntu.
https://bugs.launchpad.net/bugs/1473562

Title:
  Too many crash files kill the device

Status in Canonical System Image:
  Confirmed
Status in apport package in Ubuntu:
  Triaged

Bug description:
  Tested on krillin.

  TEST CASE:
  1. adb shell to the phone and create a crash file
  $ sh -c 'kill -SEGV $$'
  2. Now create dozens
  $ for n in $(seq 50); do ln /var/crash/_bin_dash.32011.crash /var/crash/_bin_dash_${n}.32011.crash; done
  3. Remove any "upload" and "uploaded" files that have been created and reboot
  $ sudo rm /var/crash/*upload* && sudo reboot

  ACTUAL RESULT
  Lots of whoopsie-upload-all and apport processes are created on boot; they consume all the resources of the system and leave the phone unbootable or only partially functional. The OOM killer kills random system tasks such as upstart. Depending on which processes are killed, the phone hangs on boot, reboots, or the dash doesn't come up.

  The number of crashes in this test is a bit excessive, but we can
  imagine a scenario where a dozen crash files are not uploaded
  because the phone is on cellular data, and the phone then uploads
  everything when it connects to wifi, disabling the user session.

  A way to recover is to go into recovery and clean /var/crash.

  EXPECTED RESULT
  Crash uploads are serialized, so only one crash file is uploaded at a time.
  If system resources are already low, the crash file is not uploaded.
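
  The second expectation could be approximated with a guard like the
  sketch below. It assumes a kernel that reports MemAvailable in
  /proc/meminfo; the function names, the MEMINFO override and the 100 MB
  threshold are hypothetical and are not taken from apport or whoopsie.

```shell
#!/bin/sh
# Hypothetical sketch: decide whether enough memory is available before
# starting an upload. All names and the threshold are assumptions.
MIN_FREE_KB="${MIN_FREE_KB:-102400}"   # arbitrary 100 MB threshold
MEMINFO="${MEMINFO:-/proc/meminfo}"

mem_available_kb() {
    # Print the MemAvailable value in kB; fail if the field is missing.
    awk '/^MemAvailable:/ { print $2; found = 1 }
         END { if (!found) exit 1 }' "$MEMINFO"
}

enough_memory() {
    # Succeed only when MemAvailable is at least MIN_FREE_KB.
    avail="$(mem_available_kb)" || return 1
    [ "$avail" -ge "$MIN_FREE_KB" ]
}

# Possible use (assumption): skip the upload when memory is tight.
#   enough_memory && whoopsie-upload-all
```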

To manage notifications about this bug go to:
https://bugs.launchpad.net/canonical-devices-system-image/+bug/1473562/+subscriptions