
kernel-packages team mailing list archive

[Bug 1387214] Re: [TOPBLOCKER] file corruption on touch images in rw portions of the filesystem

 

I might have hit this issue again. I have some files in a git repo on my
phone (mako running the devel channel), and when I tried to do a
`git commit -a` just now I got:

error: corrupt loose object '4e1f9449b61f1b1eea1415de21c1c90db0a01f45'
fatal: loose object 4e1f9449b61f1b1eea1415de21c1c90db0a01f45 (stored in .git/objects/4e/1f9449b61f1b1eea1415de21c1c90db0a01f45) is corrupt

This is a 6.5K zlib-compressed data file. When I moved it out of the way
and tried `git commit -a` again, I was prompted to add things to the
repo that were already there. At that point I gave up and checked out a
fresh working copy of the repo.

It is of course possible that this corruption was caused by something
like a `git pull` failing halfway through due to a network connection
issue, but I don't remember this happening.
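
One way to check whether other objects in the same repository are damaged is git's own integrity check. The following is a minimal sketch, not something run on the affected phone; `repo_is_intact` is a hypothetical helper name, and it assumes a git new enough to support `-C`.

```shell
# Hypothetical helper (not from the bug report): succeeds only if every
# object in the repository at $1 passes `git fsck --full`.
repo_is_intact() {
    git -C "$1" fsck --full >/dev/null 2>&1
}
```

Calling `repo_is_intact /path/to/repo` returns non-zero when fsck finds a corrupt or missing object, which is the case where falling back to a fresh clone makes sense.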

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-mako in Ubuntu.
https://bugs.launchpad.net/bugs/1387214

Title:
  [TOPBLOCKER] file corruption on touch images in rw portions of the
  filesystem

Status in the base for Ubuntu mobile products:
  Fix Released
Status in android package in Ubuntu:
  Fix Released
Status in android-tools package in Ubuntu:
  In Progress
Status in initramfs-tools-ubuntu-touch package in Ubuntu:
  Fix Released
Status in linux-mako package in Ubuntu:
  Confirmed
Status in android package in Ubuntu RTM:
  Fix Released
Status in android-tools package in Ubuntu RTM:
  In Progress
Status in initramfs-tools-ubuntu-touch package in Ubuntu RTM:
  Fix Released
Status in linux-mako package in Ubuntu RTM:
  Confirmed

Bug description:
  Symptoms are that cache files in /var/cache/apparmor and profiles in
  /var/lib/apparmor/profiles are sometimes corrupted after a reboot.
  We've already fixed several bugs in apparmor and click-apparmor,
  making both more robust in the face of corruption, and we've reduced
  the impact when there is a corrupted profile, but we've still not
  found the cause of the corruption. This corruption can still affect
  real-world devices: if a profile in /var/lib/apparmor/profiles is
  corrupted and the cache file is out of date, then the profile won't
  compile and that app/scope won't start.

  Workaround: remove the affected profile and then run
  'sudo aa-clickhook'. This is obviously not viable on an end-user device.

  The investigation is ongoing and this may not be a problem with the
  kernel at all, so this bug may be retargeted to another project.

  The security team and the kernel team have discussed this a lot and
  Colin King is currently looking at this. This bug is just so it can be
  tracked. Here is an excerpt from my latest email to Colin:

  "I believe I have conclusively ruled out apparmor_parser and
  aa-clickhook by creating a new 'home/bug/test-with-true.sh'. Here is the
  test output:

  http://paste.ubuntu.com/8648109/

  Specifically, home/bug/test-with-true.sh changes the interesting parts
  of the algorithm to:

  1. wait for unity8 to start (this ensures the apparmor upstart job is finished)
  2. restore apparmor_parser and aa-clickhook, if needed
  3. if /home/bug/profiles... exists, perform a diff -Naur /home/bug/profiles...
     /var/lib/apparmor/profiles and fail if there are differences (note,
     apparmor_parser and aa-clickhook were /bin/true during boot so they could
     not have changed /var/lib/apparmor/profiles)
  4. verify the profiles, exit with error if they do not verify
  5. alternately upgrade/downgrade the packages
  6. verify the profiles, exit with error if they do not verify
  7. copy the known good profiles in the previous step to /home/bug/profiles...
  8. have apparmor_parser and aa-clickhook point to /bin/true
  9. reboot
  10. go to step 1

  In the paste you'll notice that in step 6 the profiles were
  successfully created by the installation of the packages, then
  verified, then copied aside, then apparmor_parser and aa-clickhook
  diverted, then rebooted, only to have the profiles in
  /var/lib/apparmor/profiles be different than what was copied aside. It
  would be nice to verify on your device as well (I reproduced several
  times here) and verify the reproducer algorithm. I think this suggests
  this is a kernel issue and not userspace.

  IMPORTANT: you will want to update the reproducer and follow all of these steps again (i.e., I updated the scripts, the debs, the sudoers file, etc.):
  $ wget http://people.canonical.com/~jamie/cking/aa-corruption.tar.gz
  $ tar -zxvf ./aa-corruption.tar.gz
  ...

  $ adb push ./aa-corruption.tar.gz /tmp
  $ adb shell
  phablet@ubuntu-phablet:~$ cd /tmp
  phablet@ubuntu-phablet:~$ tar -zxvf ./aa-corruption.tar.gz
  phablet@ubuntu-phablet:~$ sudo mount -o remount,rw /
  phablet@ubuntu-phablet:~$ sudo cp ./aa-corruption/etc/sudoers.d/phablet /etc/sudoers.d/
  phablet@ubuntu-phablet:~$ sudo mount -o remount,ro /
  phablet@ubuntu-phablet:~$ sudo cp -a ./aa-corruption/home/bug /home
  phablet@ubuntu-phablet:~$ exit
  $ cd ./aa-corruption
  $ ./test-from-host.sh
  ...

  The old script is still in place. Simply adjust ./test-from-host.sh to have:
  testscript=/home/bug/test.sh
  #testscript=/home/bug/test-with-true.sh"

  The kernel team has verified the above reproducer and symptoms.
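
  The comparison in step 3 of the algorithm above can be sketched as a
  small shell function. This is only an illustration, not the code in
  the reproducer scripts; `profiles_unchanged` is a hypothetical name,
  and the two directories are passed as arguments rather than
  hard-coded.

```shell
# Hypothetical helper: compare a saved copy of the profiles ($1) with the
# live directory ($2) using the same diff flags as step 3.
# Exit status is 0 when the trees are identical and non-zero otherwise.
profiles_unchanged() {
    diff -Naur "$1" "$2"
}
```

  With the saved copy as the first argument and
  /var/lib/apparmor/profiles as the second, a non-zero status after
  reboot would indicate that the profiles changed while apparmor_parser
  and aa-clickhook were diverted to /bin/true.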

  Related bugs:
  * bug 1371771
  * bug 1371765
  * bug 1377338

To manage notifications about this bug go to:
https://bugs.launchpad.net/canonical-devices-system-image/+bug/1387214/+subscriptions

