← Back to team overview

touch-packages team mailing list archive

[Bug 1421009] Re: unity8 sometimes hangs on boot

 

** Branch linked: lp:~aacid/unity8/useOwnDbusConnections

-- 
You received this bug notification because you are a member of Ubuntu
Touch seeded packages, which is subscribed to autopilot in Ubuntu.
https://bugs.launchpad.net/bugs/1421009

Title:
  unity8 sometimes hangs on boot

Status in the base for Ubuntu mobile products:
  In Progress
Status in autopilot package in Ubuntu:
  Fix Released
Status in libusermetrics package in Ubuntu:
  Invalid
Status in lxc-android-config package in Ubuntu:
  Incomplete
Status in qtbase-opensource-src package in Ubuntu:
  In Progress
Status in ubuntu-system-settings-online-accounts package in Ubuntu:
  New
Status in unity8 package in Ubuntu:
  Invalid

Bug description:
  The following gdbus call is failing with a "Error: Timeout was
  reached" message:

  gdbus call --session --dest com.canonical.UnityGreeter --object-path /
  --method org.freedesktop.DBus.Properties.Get
  com.canonical.UnityGreeter IsActive

  This is being seen on krillin devices starting with image 106 from
  ubuntu-touch/devel-proposed. It doesn't happen every time, so far
  today, I've seen it 3 times from about 12 tests. On the most recent
  failure, I grabbed a console and tried repeatedly to run the command
  from the shell, even after 2 hours the timeout was still being
  returned (after about 28 seconds).

  A copy of ~/.cache/upstart/unity8.log is here:
  http://paste.ubuntu.com/10179482/

  I have 3 test cases where the problem was observed:
  http://d-jenkins.ubuntu-ci:8080/job/vivid-boottest-qtchooser/1/console
  http://d-jenkins.ubuntu-ci:8080/job/vivid-boottest-gsettings-ubuntu-touch-schemas/1/console
  http://d-jenkins.ubuntu-ci:8080/job/fjg-boottest/3/console

  In all cases, the test is using adt-run (from autopkgtest) to drive a
  test on the phone device. adt-run uses the above gdbus call to
  determine if the desktop is active. In all the examples, the device
  was freshly flashed.

  == Test Case ==

  # Prepare debugging
  adb shell
  sudo apt install qtbase5-dbg libc6-dbg libdbus-glib-1-2-dbg dbus-1-dbg libglib2.0-0-dbg
  echo "deb http://ddebs.ubuntu.com $(lsb_release -cs) main restricted universe multiverse" | sudo tee -a /etc/apt/sources.list.d/ddebs.list
  sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 428D7C01
  sudo apt-get update
  sudo apt install libusermetricsoutput1-dbgsym=1.1.1+15.04.20150219-0ubuntu1

  # Start the reboot loop
  # This reboots the device in a loop, and if this bug is not fixed by whatever proposed solution, it will hang eventually. Current highest amount of reboots without errors is 54, so it's probable a 100 reboots is needed for testing.

  bzr branch lp:unity8
  cd unity8
  while true; do adb shell rm -R "~phablet/.cache/QML"; ./tools/unlock-device || break; done

  # When it fails
  adb shell
  sudo gdb -p $(pidof unity8)
  bt

  --
  At this point, the backtrace should show:
  #0  syscall () at ../sysdeps/unix/sysv/linux/arm/syscall.S:37
  #1  0xb6301e12 in _q_futex (op=0, val=3, timeout=0x0, addr=<optimized out>)
      at thread/qmutex_linux.cpp:146
  #2  lockInternal_helper<false> (timeout=-1, elapsedTimer=0x0, d_ptr=...)
      at thread/qmutex_linux.cpp:187
  #3  QBasicMutex::lockInternal (this=this@entry=0x1523b44)
      at thread/qmutex_linux.cpp:203
  #4  0xb6301eb6 in lock (this=0x1523b44) at thread/qmutex.h:59
  #5  lock (timeout=-1, this=0x1523b38) at thread/qmutex.cpp:620
  #6  QMutex::lock (this=this@entry=0x1523d6c) at thread/qmutex.cpp:215
  #7  0xb5f39586 in QDBusMutexLocker (m=0x1523d6c, s=0x1523d48,
      a=ToggleWatchAction, this=<synthetic pointer>) at qdbusthreaddebug_p.h:183
  #8  QDBusDispatchLocker (s=0x1523d48, a=ToggleWatchAction,
      this=<synthetic pointer>) at qdbusthreaddebug_p.h:198
  #9  qDBusRealToggleWatch (d=0x1523d48, watch=0x1524dd0, fd=46)
      at qdbusintegrator.cpp:346
  #10 0xb5ae18f6 in ?? () from /lib/arm-linux-gnueabihf/libdbus-1.so.3

  With this, it's know that it was a QDBus locking related problem.
  --

  ---

  Timeline/Updates:
  2015-02-20: libusermetrics lands, causing (apparently) this boot problem to start happening rarely. http://people.canonical.com/~ogra/touch-image-stats/106.changes / http://launchpadlibrarian.net/198152771/libusermetrics_1.1.1%2B14.10.20141020-0ubuntu1_1.1.1%2B15.04.20150219-0ubuntu1.diff.gz ”I got a symbolic trace out of all the threads. It seems to be a dbus lock between usermetrics and networkmanager bits. We suspect a relation to QTBUG https://bugreports.qt.io/browse/QTBUG-44836.”;
  2015-03-25: qtbase dbus update to support threads (instead of one main thread) in PPA 018 fixes the boot issue, but autopilot test suites start failing randomly.
  2015-03-27: an autopilot fix fixes a simple test case, and seems to fix UITK suite as a whole, but on krillin only
  2015-04-10: Further patches from upstream fix all AP tests.
  2015-04-23: Upstream continues to work on the patches but they have not yet been merged. AP:s pass, but U1 account gets removed usually after a reboot, even though apps can be installed after adding U1 account flawlessly for the duration of that boot.

To manage notifications about this bug go to:
https://bugs.launchpad.net/canonical-devices-system-image/+bug/1421009/+subscriptions


References