touch-packages team mailing list archive
Message #73567
[Bug 1421009] Re: unity8 sometimes hangs on boot
** Description changed:
The following gdbus call is failing with an "Error: Timeout was reached"
message:
gdbus call --session --dest com.canonical.UnityGreeter --object-path /
--method org.freedesktop.DBus.Properties.Get com.canonical.UnityGreeter
IsActive
This is being seen on krillin devices starting with image 106 from
ubuntu-touch/devel-proposed. It doesn't happen every time; so far today,
I've seen it 3 times in about 12 tests. On the most recent failure, I
grabbed a console and repeatedly ran the command from the shell. Even
after 2 hours, the timeout was still being returned (after about 28
seconds).
A copy of ~/.cache/upstart/unity8.log is here:
http://paste.ubuntu.com/10179482/
I have 3 test cases where the problem was observed:
http://d-jenkins.ubuntu-ci:8080/job/vivid-boottest-qtchooser/1/console
http://d-jenkins.ubuntu-ci:8080/job/vivid-boottest-gsettings-ubuntu-touch-schemas/1/console
http://d-jenkins.ubuntu-ci:8080/job/fjg-boottest/3/console
In all cases, the test is using adt-run (from autopkgtest) to drive a
test on the phone device. adt-run uses the above gdbus call to determine
if the desktop is active. In all the examples, the device was freshly
flashed.
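The readiness check described above can be sketched as a retry wrapper around the gdbus call. This is only an illustration, not adt-run's actual code: the helper names `retry_until` and `greeter_is_active` are hypothetical, and the sketch only checks that the call returns at all, not what value `IsActive` has.

```shell
#!/bin/sh
# Hypothetical helper (not part of adt-run): retry a command until it
# succeeds or the attempts run out.
# Usage: retry_until MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
retry_until() {
    max_attempts=$1
    delay=$2
    shift 2
    attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        attempt=$((attempt + 1))
        sleep "$delay"
    done
    return 1
}

# The greeter query from the report, wrapped so it can be retried.
# It succeeds when gdbus gets any reply and fails on the timeout error.
greeter_is_active() {
    gdbus call --session --dest com.canonical.UnityGreeter --object-path / \
        --method org.freedesktop.DBus.Properties.Get \
        com.canonical.UnityGreeter IsActive
}

# Example, to be run on the device (attempt count and delay are assumptions):
#   retry_until 10 5 greeter_is_active || echo "greeter did not respond"
```

On a hung device, every attempt would be expected to end in the timeout described in the report, so the wrapper would return non-zero.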
== Test Case ==
# Prepare debugging
adb shell
sudo apt install qtbase5-dbg libc6-dbg libdbus-glib-1-2-dbg dbus-1-dbg libglib2.0-0-dbg
+ echo "deb http://ddebs.ubuntu.com $(lsb_release -cs) main restricted universe multiverse" | sudo tee -a /etc/apt/sources.list.d/ddebs.list
+ sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 428D7C01
+ sudo apt-get update
+ sudo apt install libusermetricsoutput1-dbgsym=1.1.1+15.04.20150219-0ubuntu1
# Start the reboot loop
# This reboots the device in a loop; if the proposed fix does not resolve this bug, the device will eventually hang. The longest run of reboots without errors so far is 54, so about 100 reboots are probably needed for testing.
bzr branch lp:unity8
cd unity8
while true; do adb shell rm -R "~phablet/.cache/QML"; ./tools/unlock-device || break; done
# When it fails
adb shell
sudo gdb -p $(pidof unity8)
bt
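Since the note in the test case suggests about 100 reboots for a meaningful run, a bounded variant of the reboot loop may be convenient. The following is a sketch under that assumption; `run_until_failure` and `reboot_cycle` are hypothetical names, and the cycle body simply repeats the two commands from the loop above.

```shell
#!/bin/sh
# Hypothetical helper: run a command up to LIMIT times and stop at the
# first failure.
# Usage: run_until_failure LIMIT COMMAND [ARGS...]
run_until_failure() {
    limit=$1
    shift
    count=0
    while [ "$count" -lt "$limit" ]; do
        if ! "$@"; then
            echo "failed on iteration $((count + 1)) of $limit"
            return 1
        fi
        count=$((count + 1))
    done
    echo "completed $count iterations without a failure"
}

# One cycle of the loop from the test case: clear the QML cache, then let
# tools/unlock-device handle the device. The function's exit status is
# that of unlock-device, matching the "|| break" in the original loop.
reboot_cycle() {
    adb shell rm -R "~phablet/.cache/QML"
    ./tools/unlock-device
}

# Example, from the unity8 branch directory:
#   run_until_failure 100 reboot_cycle
```

A run that completes all iterations does not prove the bug is fixed; it only makes a remaining hang less likely.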
--
At this point, the backtrace should show:
#0 syscall () at ../sysdeps/unix/sysv/linux/arm/syscall.S:37
#1 0xb6301e12 in _q_futex (op=0, val=3, timeout=0x0, addr=<optimized out>)
at thread/qmutex_linux.cpp:146
#2 lockInternal_helper<false> (timeout=-1, elapsedTimer=0x0, d_ptr=...)
at thread/qmutex_linux.cpp:187
#3 QBasicMutex::lockInternal (this=this@entry=0x1523b44)
at thread/qmutex_linux.cpp:203
#4 0xb6301eb6 in lock (this=0x1523b44) at thread/qmutex.h:59
#5 lock (timeout=-1, this=0x1523b38) at thread/qmutex.cpp:620
#6 QMutex::lock (this=this@entry=0x1523d6c) at thread/qmutex.cpp:215
#7 0xb5f39586 in QDBusMutexLocker (m=0x1523d6c, s=0x1523d48,
a=ToggleWatchAction, this=<synthetic pointer>) at qdbusthreaddebug_p.h:183
#8 QDBusDispatchLocker (s=0x1523d48, a=ToggleWatchAction,
this=<synthetic pointer>) at qdbusthreaddebug_p.h:198
#9 qDBusRealToggleWatch (d=0x1523d48, watch=0x1524dd0, fd=46)
at qdbusintegrator.cpp:346
#10 0xb5ae18f6 in ?? () from /lib/arm-linux-gnueabihf/libdbus-1.so.3
This shows that it was a QDBus locking-related problem.
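Because a lock problem involves more than one thread, it can help to save a backtrace of every thread rather than only the current one. The sketch below uses gdb's batch mode for that; the function name `dump_all_threads` and the `GDB` override variable are hypothetical conveniences, and the function is assumed to run as root on the device, as in the test case.

```shell
#!/bin/sh
# Hypothetical helper: write a backtrace of every thread of process PID
# to OUTFILE, then detach.
# Usage: dump_all_threads PID OUTFILE
# Set GDB to use a different gdb binary (assumption; defaults to "gdb").
dump_all_threads() {
    pid=$1
    out=$2
    "${GDB:-gdb}" -p "$pid" -batch -ex "thread apply all bt" > "$out" 2>&1
}

# Example, on the device as root (the output path is an assumption):
#   dump_all_threads "$(pidof unity8)" /tmp/unity8-threads.txt
```

The saved file can then be attached to the bug report instead of pasting the trace by hand.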
--
---
Timeline/Updates:
2015-02-20: libusermetrics lands, apparently causing this boot problem to start occurring intermittently. http://people.canonical.com/~ogra/touch-image-stats/106.changes / http://launchpadlibrarian.net/198152771/libusermetrics_1.1.1%2B14.10.20141020-0ubuntu1_1.1.1%2B15.04.20150219-0ubuntu1.diff.gz "I got a symbolic trace out of all the threads. It seems to be a dbus lock between usermetrics and networkmanager bits. We suspect a relation to QTBUG https://bugreports.qt.io/browse/QTBUG-44836."
2015-03-25: qtbase dbus update to support threads (instead of one main thread) in PPA 018 fixes the boot issue, but autopilot test suites start failing randomly.
2015-03-27: An autopilot fix resolves a simple test case and seems to fix the UITK suite as a whole, but on krillin only.
2015-04-10: Further patches from upstream fix all AP tests.
2015-04-23: Upstream continues to work on the patches, but they have not yet been merged. AP tests pass, but the U1 account is usually removed after a reboot, even though apps install flawlessly for the duration of the boot in which the U1 account was added.
--
You received this bug notification because you are a member of Ubuntu
Touch seeded packages, which is subscribed to autopilot in Ubuntu.
https://bugs.launchpad.net/bugs/1421009
Title:
unity8 sometimes hangs on boot
Status in the base for Ubuntu mobile products:
In Progress
Status in autopilot package in Ubuntu:
Fix Released
Status in libusermetrics package in Ubuntu:
Invalid
Status in lxc-android-config package in Ubuntu:
Incomplete
Status in qtbase-opensource-src package in Ubuntu:
In Progress
Status in ubuntu-system-settings-online-accounts package in Ubuntu:
New
Status in unity8 package in Ubuntu:
Invalid
Bug description:
The following gdbus call is failing with an "Error: Timeout was
reached" message:
gdbus call --session --dest com.canonical.UnityGreeter --object-path /
--method org.freedesktop.DBus.Properties.Get
com.canonical.UnityGreeter IsActive
This is being seen on krillin devices starting with image 106 from
ubuntu-touch/devel-proposed. It doesn't happen every time; so far
today, I've seen it 3 times in about 12 tests. On the most recent
failure, I grabbed a console and repeatedly ran the command from the
shell. Even after 2 hours, the timeout was still being returned
(after about 28 seconds).
A copy of ~/.cache/upstart/unity8.log is here:
http://paste.ubuntu.com/10179482/
I have 3 test cases where the problem was observed:
http://d-jenkins.ubuntu-ci:8080/job/vivid-boottest-qtchooser/1/console
http://d-jenkins.ubuntu-ci:8080/job/vivid-boottest-gsettings-ubuntu-touch-schemas/1/console
http://d-jenkins.ubuntu-ci:8080/job/fjg-boottest/3/console
In all cases, the test is using adt-run (from autopkgtest) to drive a
test on the phone device. adt-run uses the above gdbus call to
determine if the desktop is active. In all the examples, the device
was freshly flashed.
== Test Case ==
# Prepare debugging
adb shell
sudo apt install qtbase5-dbg libc6-dbg libdbus-glib-1-2-dbg dbus-1-dbg libglib2.0-0-dbg
echo "deb http://ddebs.ubuntu.com $(lsb_release -cs) main restricted universe multiverse" | sudo tee -a /etc/apt/sources.list.d/ddebs.list
sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 428D7C01
sudo apt-get update
sudo apt install libusermetricsoutput1-dbgsym=1.1.1+15.04.20150219-0ubuntu1
# Start the reboot loop
# This reboots the device in a loop; if the proposed fix does not resolve this bug, the device will eventually hang. The longest run of reboots without errors so far is 54, so about 100 reboots are probably needed for testing.
bzr branch lp:unity8
cd unity8
while true; do adb shell rm -R "~phablet/.cache/QML"; ./tools/unlock-device || break; done
# When it fails
adb shell
sudo gdb -p $(pidof unity8)
bt
--
At this point, the backtrace should show:
#0 syscall () at ../sysdeps/unix/sysv/linux/arm/syscall.S:37
#1 0xb6301e12 in _q_futex (op=0, val=3, timeout=0x0, addr=<optimized out>)
at thread/qmutex_linux.cpp:146
#2 lockInternal_helper<false> (timeout=-1, elapsedTimer=0x0, d_ptr=...)
at thread/qmutex_linux.cpp:187
#3 QBasicMutex::lockInternal (this=this@entry=0x1523b44)
at thread/qmutex_linux.cpp:203
#4 0xb6301eb6 in lock (this=0x1523b44) at thread/qmutex.h:59
#5 lock (timeout=-1, this=0x1523b38) at thread/qmutex.cpp:620
#6 QMutex::lock (this=this@entry=0x1523d6c) at thread/qmutex.cpp:215
#7 0xb5f39586 in QDBusMutexLocker (m=0x1523d6c, s=0x1523d48,
a=ToggleWatchAction, this=<synthetic pointer>) at qdbusthreaddebug_p.h:183
#8 QDBusDispatchLocker (s=0x1523d48, a=ToggleWatchAction,
this=<synthetic pointer>) at qdbusthreaddebug_p.h:198
#9 qDBusRealToggleWatch (d=0x1523d48, watch=0x1524dd0, fd=46)
at qdbusintegrator.cpp:346
#10 0xb5ae18f6 in ?? () from /lib/arm-linux-gnueabihf/libdbus-1.so.3
This shows that it was a QDBus locking-related problem.
--
---
Timeline/Updates:
2015-02-20: libusermetrics lands, apparently causing this boot problem to start occurring intermittently. http://people.canonical.com/~ogra/touch-image-stats/106.changes / http://launchpadlibrarian.net/198152771/libusermetrics_1.1.1%2B14.10.20141020-0ubuntu1_1.1.1%2B15.04.20150219-0ubuntu1.diff.gz "I got a symbolic trace out of all the threads. It seems to be a dbus lock between usermetrics and networkmanager bits. We suspect a relation to QTBUG https://bugreports.qt.io/browse/QTBUG-44836."
2015-03-25: qtbase dbus update to support threads (instead of one main thread) in PPA 018 fixes the boot issue, but autopilot test suites start failing randomly.
2015-03-27: An autopilot fix resolves a simple test case and seems to fix the UITK suite as a whole, but on krillin only.
2015-04-10: Further patches from upstream fix all AP tests.
2015-04-23: Upstream continues to work on the patches, but they have not yet been merged. AP tests pass, but the U1 account is usually removed after a reboot, even though apps install flawlessly for the duration of the boot in which the U1 account was added.
To manage notifications about this bug go to:
https://bugs.launchpad.net/canonical-devices-system-image/+bug/1421009/+subscriptions