touch-packages team mailing list archive
Message #89292
[Bug 1473562] Re: Too many crash files kill the device
The existing upstart job for whoopsie-upload-all is designed to ensure
that only one whoopsie-upload-all process takes care of the upload
processing on boot. However, it's true that if there are multiple crash
files in the directory, separate whoopsie-upload-all processes will be
/spawned/ for each crash file, and they will be spawned in parallel. So
while the current behavior is responsible with its use of CPU time, it's
less than ideal with memory usage; each instance of the whoopsie-upload-all
Python script uses about 50 MB (on amd64), and while some of that is
shared libraries, the unshared portion will add up.
I suggest that a simple fix for this would be to change the apport-noui
upstart job to wrap the calls to whoopsie-upload-all with
/lib/udev/watershed. This would limit us to one running
whoopsie-upload-all process at a time; there would be multiple watershed
processes, but those processes take up much less memory (roughly 300 kB
each).
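To illustrate the general idea of funnelling parallel invocations through
a single lock, here is a minimal, runnable sketch. Note the assumptions:
it uses flock(1) from util-linux as a stand-in, not /lib/udev/watershed
itself, whose behaviour may differ; fake_upload is a hypothetical
placeholder for whoopsie-upload-all; and the lock and log live in a
throwaway temporary directory.

```shell
#!/bin/sh
# Sketch only: flock(1) stands in for /lib/udev/watershed, and fake_upload
# is a hypothetical placeholder for whoopsie-upload-all.

workdir=$(mktemp -d)
lock="$workdir/upload.lock"
log="$workdir/upload.log"

fake_upload() {
    # Record start and end so that overlapping runs would show up in the log.
    echo "start $1" >> "$log"
    sleep 1
    echo "end $1" >> "$log"
}

# Launch one background job per (pretend) crash file, all at once.
for n in 1 2 3; do
    (
        flock 9              # block until this job holds the lock on fd 9
        fake_upload "$n"
    ) 9>"$lock" &
done
wait

cat "$log"
```

All three jobs start in parallel, but only the lock holder does any work,
so each "start" line in the log is immediately followed by its matching
"end" line. The waiting jobs are cheap subshells rather than full uploader
processes, which is the same trade-off described above.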
Brian, can you look at adding watershed to this job to see if that
addresses the problems for the phone?
** Changed in: apport (Ubuntu)
Importance: Undecided => High
** Changed in: apport (Ubuntu)
Status: New => Triaged
** Changed in: apport (Ubuntu)
Assignee: Steve Langasek (vorlon) => Brian Murray (brian-murray)
--
You received this bug notification because you are a member of Ubuntu
Touch seeded packages, which is subscribed to apport in Ubuntu.
https://bugs.launchpad.net/bugs/1473562
Title:
Too many crash files kill the device
Status in Canonical System Image:
Confirmed
Status in apport package in Ubuntu:
Triaged
Bug description:
Tested on krillin.
TEST CASE:
1. adb shell to the phone and create a crash file
$ sh -c 'kill -SEGV $$'
2. Now create dozens more by hard-linking that crash file
$ for n in $(seq 50); do ln /var/crash/_bin_dash.32011.crash /var/crash/_bin_dash_${n}.32011.crash; done
3. Remove any "upload" and "uploaded" files that have been created and reboot
$ sudo rm /var/crash/*upload* && sudo reboot
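To see how many crash files the reboot in step 3 would pick up, a small
helper along these lines could be used. This is only a sketch:
pending_crashes is a hypothetical name, and it assumes that a report
NAME.crash that has already been uploaded has a NAME.uploaded marker
beside it.

```shell
#!/bin/sh
# Sketch only: pending_crashes is a hypothetical helper; it assumes that an
# uploaded report NAME.crash is accompanied by a NAME.uploaded marker.

pending_crashes() {
    dir=${1:-/var/crash}
    for f in "$dir"/*.crash; do
        [ -e "$f" ] || continue                      # glob matched nothing
        [ -e "${f%.crash}.uploaded" ] || echo "$f"   # no marker: still pending
    done
}

# Count the crash files that are still waiting to be uploaded.
pending_crashes /var/crash | wc -l
```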
ACTUAL RESULT
Lots of whoopsie-upload-all and apport processes are created on boot; they consume all the resources of the system and leave the phone unbootable or only partially functional. The OOM killer kills random system tasks such as upstart. Depending on which processes are killed, the phone hangs on boot, reboots, dash doesn't come up...
The number of crashes in this test is a bit excessive, but we can
imagine a scenario where a dozen crash files are not uploaded
because the phone is on cellular data, and the phone then uploads
everything when it connects to wifi, disabling the user session.
A way to recover is to go into recovery and clean /var/crash.
EXPECTED RESULT
Crash uploads are serialized, so only one crash file is uploaded at a time.
If system resources are already low, the crash file is not uploaded.
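The "resources already low" check could look roughly like the sketch
below. Everything in it is an assumption for illustration: the function
names, the 100 MB threshold, and the choice of the MemAvailable field of
/proc/meminfo, which only exists on Linux 3.14 and later. A missing field
is treated as low memory, so the upload is skipped.

```shell
#!/bin/sh
# Sketch only: the 100 MB threshold and both function names are
# illustrative assumptions, not part of apport or whoopsie.

# Print MemAvailable in kB from a meminfo-style file (default /proc/meminfo).
mem_available_kb() {
    awk '$1 == "MemAvailable:" { print $2; exit }' "${1:-/proc/meminfo}"
}

# Succeed only if at least $1 kB are available according to file $2.
enough_memory() {
    avail=$(mem_available_kb "${2:-}")
    [ -n "$avail" ] && [ "$avail" -ge "$1" ]
}

if enough_memory 102400; then
    echo "ok to upload"
else
    echo "low memory, skipping upload"
fi
```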
To manage notifications about this bug go to:
https://bugs.launchpad.net/canonical-devices-system-image/+bug/1473562/+subscriptions