touch-packages team mailing list archive

[Bug 1421009] Re: unity8 sometimes hangs on boot

 

** Changed in: canonical-devices-system-image
       Status: Fix Committed => Fix Released

-- 
You received this bug notification because you are a member of Ubuntu
Touch seeded packages, which is subscribed to autopilot in Ubuntu.
https://bugs.launchpad.net/bugs/1421009

Title:
  unity8 sometimes hangs on boot

Status in the base for Ubuntu mobile products:
  Fix Released
Status in autopilot package in Ubuntu:
  Fix Released
Status in libusermetrics package in Ubuntu:
  Fix Released
Status in lxc-android-config package in Ubuntu:
  Invalid
Status in qtbase-opensource-src package in Ubuntu:
  In Progress
Status in ubuntu-system-settings-online-accounts package in Ubuntu:
  New
Status in unity8 package in Ubuntu:
  Invalid

Bug description:
  The following gdbus call is failing with an "Error: Timeout was
  reached" message:

  gdbus call --session --dest com.canonical.UnityGreeter --object-path /
  --method org.freedesktop.DBus.Properties.Get
  com.canonical.UnityGreeter IsActive

  This is being seen on krillin devices starting with image 106 from
  ubuntu-touch/devel-proposed. It doesn't happen every time: so far
  today I've seen it 3 times in about 12 tests. On the most recent
  failure, I grabbed a console and repeatedly ran the command from the
  shell. Even after 2 hours, each call still returned the timeout
  (after about 28 seconds).

  A copy of ~/.cache/upstart/unity8.log is here:
  http://paste.ubuntu.com/10179482/

  I have 3 test cases where the problem was observed:
  http://d-jenkins.ubuntu-ci:8080/job/vivid-boottest-qtchooser/1/console
  http://d-jenkins.ubuntu-ci:8080/job/vivid-boottest-gsettings-ubuntu-touch-schemas/1/console
  http://d-jenkins.ubuntu-ci:8080/job/fjg-boottest/3/console

  In all cases, the test is using adt-run (from autopkgtest) to drive a
  test on the phone device. adt-run uses the above gdbus call to
  determine if the desktop is active. In all the examples, the device
  was freshly flashed.
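
  The readiness check described above can be sketched as a small polling
  helper. This is only an illustration, not adt-run's actual code: the
  function names wait_for_greeter and greeter_probe, the attempt count and
  the delay are all assumptions. Only the gdbus call itself comes from this
  report.

```shell
#!/bin/sh
# Hypothetical helper, not taken from adt-run.

# The probe from this bug report: ask the greeter for its IsActive property.
greeter_probe() {
    gdbus call --session --dest com.canonical.UnityGreeter --object-path / \
        --method org.freedesktop.DBus.Properties.Get \
        com.canonical.UnityGreeter IsActive
}

# wait_for_greeter ATTEMPTS DELAY [PROBE_COMMAND...]
# Runs the probe up to ATTEMPTS times, sleeping DELAY seconds between tries.
# Returns 0 as soon as the probe succeeds, 1 if every attempt fails.
wait_for_greeter() {
    attempts=$1
    delay=$2
    shift 2
    # Default to the gdbus probe when no probe command is given.
    [ $# -gt 0 ] || set -- greeter_probe
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        if [ "$i" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        i=$((i + 1))
    done
    return 1
}
```

  For example, "wait_for_greeter 5 2" would try the gdbus call five times.
  On a device hit by this bug every attempt times out, so the helper
  returns 1 no matter how many retries are allowed.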

  == Test Case ==

  # Prepare debugging
  adb shell
  sudo apt-get clean # so that you don't run out of disk space
  sudo apt install qtbase5-dbg libc6-dbg libdbus-glib-1-2-dbg dbus-1-dbg libglib2.0-0-dbg

  # Also add the libusermetrics debug symbols, unless you're testing a PPA version
  echo "deb http://ddebs.ubuntu.com $(lsb_release -cs) main restricted universe multiverse" | sudo tee -a /etc/apt/sources.list.d/ddebs.list
  sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 428D7C01
  sudo apt-get update
  sudo apt install libusermetricsoutput1-dbgsym=1.1.1+15.04.20150219-0ubuntu1

  # Start the reboot loop
  # This reboots the device in a loop. If the proposed fix does not resolve
  # this bug, the device will eventually hang with Unity 8 showing a black
  # background. Other kinds of hangs (for example, only the Google logo
  # showing and no adb) are not related to this bug. The highest number of
  # reboots seen so far without an error is 54, so about 100 reboots are
  # probably needed for testing.

  bzr branch lp:unity8
  cd unity8
  while true; do adb shell rm -R "~phablet/.cache/QML"; ./tools/unlock-device || break; done
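
  Since roughly 100 clean reboots are needed to trust a fix, a bounded
  variant of the loop may be more convenient than "while true". The sketch
  below is an assumption-laden illustration: run_reboot_loop and one_cycle
  are hypothetical names, and the cycle body simply repeats the two
  commands from the loop above.

```shell
#!/bin/sh
# Hypothetical helper: run a command up to MAX times, stopping at the
# first failure and reporting how many iterations completed cleanly.
# run_reboot_loop MAX COMMAND...
run_reboot_loop() {
    max=$1
    shift
    count=0
    while [ "$count" -lt "$max" ]; do
        if ! "$@"; then
            echo "failed after $count clean reboots"
            return 1
        fi
        count=$((count + 1))
    done
    echo "completed $count reboots without a hang"
}

# One iteration of the loop from this test case (run from the unity8 branch).
one_cycle() {
    adb shell rm -R "~phablet/.cache/QML"
    ./tools/unlock-device
}
```

  Running "run_reboot_loop 100 one_cycle" then either stops at the first
  hang, leaving the device in the failed state for gdb, or reports that
  all 100 iterations passed.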

  # When it fails
  adb shell
  sudo gdb -p $(pidof unity8)
  bt full
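
  Because the hang is a lock between threads, a trace of every thread is
  often more useful than the current thread alone. The sketch below collects
  one non-interactively using gdb's -batch and -ex options. The function
  name dump_backtraces and the optional debugger override are assumptions
  made for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print a full backtrace of every thread of a process.
# dump_backtraces PID [DEBUGGER_COMMAND...]
dump_backtraces() {
    pid=$1
    shift
    # Default to "sudo gdb", as in the manual steps above.
    [ $# -gt 0 ] || set -- sudo gdb
    # -batch exits once the commands given with -ex have run.
    "$@" -p "$pid" -batch -ex "thread apply all bt full"
}
```

  On the device this could be run as
  dump_backtraces "$(pidof unity8)" > unity8-threads.txt
  so that the output can be attached to the bug.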

  --
  At this point, the backtrace should show:
  #0  syscall () at ../sysdeps/unix/sysv/linux/arm/syscall.S:37
  #1  0xb6301e12 in _q_futex (op=0, val=3, timeout=0x0, addr=<optimized out>)
      at thread/qmutex_linux.cpp:146
  #2  lockInternal_helper<false> (timeout=-1, elapsedTimer=0x0, d_ptr=...)
      at thread/qmutex_linux.cpp:187
  #3  QBasicMutex::lockInternal (this=this@entry=0x1523b44)
      at thread/qmutex_linux.cpp:203
  #4  0xb6301eb6 in lock (this=0x1523b44) at thread/qmutex.h:59
  #5  lock (timeout=-1, this=0x1523b38) at thread/qmutex.cpp:620
  #6  QMutex::lock (this=this@entry=0x1523d6c) at thread/qmutex.cpp:215
  #7  0xb5f39586 in QDBusMutexLocker (m=0x1523d6c, s=0x1523d48,
      a=ToggleWatchAction, this=<synthetic pointer>) at qdbusthreaddebug_p.h:183
  #8  QDBusDispatchLocker (s=0x1523d48, a=ToggleWatchAction,
      this=<synthetic pointer>) at qdbusthreaddebug_p.h:198
  #9  qDBusRealToggleWatch (d=0x1523d48, watch=0x1524dd0, fd=46)
      at qdbusintegrator.cpp:346
  #10 0xb5ae18f6 in ?? () from /lib/arm-linux-gnueabihf/libdbus-1.so.3

  With this, it's known that this was a QDBus locking problem.
  --

  ---

  Timeline/Updates:
  2015-02-20: libusermetrics lands, apparently causing this boot problem to start happening occasionally. Changes: http://people.canonical.com/~ogra/touch-image-stats/106.changes / http://launchpadlibrarian.net/198152771/libusermetrics_1.1.1%2B14.10.20141020-0ubuntu1_1.1.1%2B15.04.20150219-0ubuntu1.diff.gz "I got a symbolic trace out of all the threads. It seems to be a dbus lock between usermetrics and networkmanager bits. We suspect a relation to QTBUG https://bugreports.qt.io/browse/QTBUG-44836."
  2015-03-25: qtbase dbus update to support threads (instead of one main thread) in PPA 018 fixes the boot issue, but autopilot test suites start failing randomly.
  2015-03-27: An autopilot fix resolves a simple test case and seems to fix the UITK suite as a whole, but on krillin only.
  2015-04-10: Further patches from upstream fix all AP tests.
  2015-04-23: Upstream continues to work on the patches, but they have not yet been merged. AP tests pass, but the U1 account is usually removed after a reboot, even though apps can be installed without problems for the rest of the boot in which the U1 account was added.

To manage notifications about this bug go to:
https://bugs.launchpad.net/canonical-devices-system-image/+bug/1421009/+subscriptions

