touch-packages team mailing list archive
Message #74102
[Bug 1421009] Re: unity8 sometimes hangs on boot
** Changed in: canonical-devices-system-image
Status: Fix Committed => Fix Released
--
You received this bug notification because you are a member of Ubuntu
Touch seeded packages, which is subscribed to autopilot in Ubuntu.
https://bugs.launchpad.net/bugs/1421009
Title:
unity8 sometimes hangs on boot
Status in the base for Ubuntu mobile products:
Fix Released
Status in autopilot package in Ubuntu:
Fix Released
Status in libusermetrics package in Ubuntu:
Fix Released
Status in lxc-android-config package in Ubuntu:
Invalid
Status in qtbase-opensource-src package in Ubuntu:
In Progress
Status in ubuntu-system-settings-online-accounts package in Ubuntu:
New
Status in unity8 package in Ubuntu:
Invalid
Bug description:
The following gdbus call is failing with an "Error: Timeout was
reached" message:
gdbus call --session --dest com.canonical.UnityGreeter --object-path /
--method org.freedesktop.DBus.Properties.Get
com.canonical.UnityGreeter IsActive
This is being seen on krillin devices starting with image 106 from
ubuntu-touch/devel-proposed. It doesn't happen every time: so far
today, I've seen it 3 times in about 12 tests. On the most recent
failure, I grabbed a console and repeatedly ran the command from the
shell. Even after 2 hours, the timeout was still being returned
(after about 28 seconds).
A copy of ~/.cache/upstart/unity8.log is here:
http://paste.ubuntu.com/10179482/
I have 3 test cases where the problem was observed:
http://d-jenkins.ubuntu-ci:8080/job/vivid-boottest-qtchooser/1/console
http://d-jenkins.ubuntu-ci:8080/job/vivid-boottest-gsettings-ubuntu-touch-schemas/1/console
http://d-jenkins.ubuntu-ci:8080/job/fjg-boottest/3/console
In all cases, the test is using adt-run (from autopkgtest) to drive a
test on the phone device. adt-run uses the above gdbus call to
determine if the desktop is active. In all the examples, the device
was freshly flashed.
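For anyone reproducing this by hand, the check can be wrapped in a small polling loop. This is only a sketch, not how adt-run itself is implemented: the function names, the GDBUS_CMD override and the attempt/delay parameters are all made up for illustration. Only the gdbus arguments come from the call quoted above.

```shell
#!/bin/sh
# Hypothetical helper, not taken from adt-run.
# GDBUS_CMD lets a different binary be substituted (e.g. for a dry run).
GDBUS_CMD=${GDBUS_CMD:-gdbus}

# Issue the same property query as in the bug description.
greeter_is_active() {
    "$GDBUS_CMD" call --session --dest com.canonical.UnityGreeter --object-path / \
        --method org.freedesktop.DBus.Properties.Get \
        com.canonical.UnityGreeter IsActive
}

# Retry the query up to $1 times, sleeping $2 seconds between attempts.
# Returns 0 as soon as one call succeeds, 1 if every attempt fails.
wait_for_greeter() {
    attempts=$1
    delay=$2
    i=0
    while [ "$i" -lt "$attempts" ]; do
        if greeter_is_active >/dev/null 2>&1; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}
```

On a hung device every attempt would be expected to fail with the timeout, so something like "wait_for_greeter 5 10 || echo hung" run inside the phablet session gives a quick yes/no answer.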
== Test Case ==
# Prepare debugging
adb shell
sudo apt-get clean # so that you don't run out of disk space
sudo apt install qtbase5-dbg libc6-dbg libdbus-glib-1-2-dbg dbus-1-dbg libglib2.0-0-dbg
# Also add libusermetrics debug symbols, unless you're testing a PPA version
echo "deb http://ddebs.ubuntu.com $(lsb_release -cs) main restricted universe multiverse" | sudo tee -a /etc/apt/sources.list.d/ddebs.list
sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 428D7C01
sudo apt-get update
sudo apt install libusermetricsoutput1-dbgsym=1.1.1+15.04.20150219-0ubuntu1
# Start the reboot loop
# This reboots the device in a loop. If the bug is not fixed by the
# proposed solution, the device will eventually hang with Unity 8
# showing a black background. Other kinds of hangs (such as only the
# Google logo showing, with no adb) are not related to this bug.
# The highest number of reboots without errors so far is 54, so about
# 100 reboots are probably needed for testing.
bzr branch lp:unity8
cd unity8
while true; do adb shell rm -R "~phablet/.cache/QML"; ./tools/unlock-device || break; done
# When it fails
adb shell
sudo gdb -p $(pidof unity8)
bt full
--
At this point, the backtrace should show:
#0 syscall () at ../sysdeps/unix/sysv/linux/arm/syscall.S:37
#1 0xb6301e12 in _q_futex (op=0, val=3, timeout=0x0, addr=<optimized out>)
    at thread/qmutex_linux.cpp:146
#2 lockInternal_helper<false> (timeout=-1, elapsedTimer=0x0, d_ptr=...)
    at thread/qmutex_linux.cpp:187
#3 QBasicMutex::lockInternal (this=this@entry=0x1523b44)
    at thread/qmutex_linux.cpp:203
#4 0xb6301eb6 in lock (this=0x1523b44) at thread/qmutex.h:59
#5 lock (timeout=-1, this=0x1523b38) at thread/qmutex.cpp:620
#6 QMutex::lock (this=this@entry=0x1523d6c) at thread/qmutex.cpp:215
#7 0xb5f39586 in QDBusMutexLocker (m=0x1523d6c, s=0x1523d48,
    a=ToggleWatchAction, this=<synthetic pointer>) at qdbusthreaddebug_p.h:183
#8 QDBusDispatchLocker (s=0x1523d48, a=ToggleWatchAction,
    this=<synthetic pointer>) at qdbusthreaddebug_p.h:198
#9 qDBusRealToggleWatch (d=0x1523d48, watch=0x1524dd0, fd=46)
    at qdbusintegrator.cpp:346
#10 0xb5ae18f6 in ?? () from /lib/arm-linux-gnueabihf/libdbus-1.so.3
This shows that the hang is a QDBus locking problem.
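Since other kinds of hangs are unrelated to this bug, it can help to check a captured backtrace for the frames shown above before counting a failure. The following is a rough sketch under that assumption: the function name is hypothetical, and matching on just two frame names (QMutex::lock and QDBusDispatchLocker) is a simplification, not a complete signature of the bug.

```shell
#!/bin/sh
# Hypothetical helper: classify a "bt full" dump read from stdin.
# Returns 0 when both a QMutex::lock frame and a QDBusDispatchLocker
# frame are present (as in the backtrace above), 1 otherwise.
is_qdbus_lock_hang() {
    bt=$(cat)
    printf '%s\n' "$bt" | grep -q 'QMutex::lock' &&
        printf '%s\n' "$bt" | grep -q 'QDBusDispatchLocker'
}
```

A possible use on the device would be "sudo gdb -p $(pidof unity8) -batch -ex 'bt full' | is_qdbus_lock_hang && echo 'matches bug 1421009'", which runs the same backtrace command non-interactively.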
--
---
Timeline/Updates:
2015-02-20: libusermetrics lands, apparently causing this boot problem to start happening occasionally. http://people.canonical.com/~ogra/touch-image-stats/106.changes / http://launchpadlibrarian.net/198152771/libusermetrics_1.1.1%2B14.10.20141020-0ubuntu1_1.1.1%2B15.04.20150219-0ubuntu1.diff.gz "I got a symbolic trace out of all the threads. It seems to be a dbus lock between usermetrics and networkmanager bits. We suspect a relation to QTBUG https://bugreports.qt.io/browse/QTBUG-44836."
2015-03-25: A qtbase dbus update to support threads (instead of one main thread) in PPA 018 fixes the boot issue, but autopilot test suites start failing randomly.
2015-03-27: An autopilot fix fixes a simple test case and seems to fix the UITK suite as a whole, but on krillin only.
2015-04-10: Further patches from upstream fix all AP tests.
2015-04-23: Upstream continues to work on the patches, but they have not yet been merged. The AP tests pass, but the U1 account usually gets removed after a reboot, even though apps can be installed flawlessly after adding the U1 account for the duration of that boot.
To manage notifications about this bug go to:
https://bugs.launchpad.net/canonical-devices-system-image/+bug/1421009/+subscriptions