← Back to team overview

canonical-ubuntu-qa team mailing list archive

[Bug 2056461] Re: autopkgtest-virt-qemu on noble images sometimes hangs doing copydown

 

Performing the verification on Noble.

First, we check to see if we can reproduce the problem.

$ apt policy autopkgtest
autopkgtest:
  Installed: 5.34ubuntu2
  Candidate: 5.34ubuntu2
  Version table:
     5.38ubuntu1~24.04.1 100
        100 http://archive.ubuntu.com/ubuntu noble-proposed/main amd64 Packages
 *** 5.34ubuntu2 500
        500 http://archive.ubuntu.com/ubuntu noble/main amd64 Packages
        100 /var/lib/dpkg/status

After creating a Noble image we run the reproducer:

$ autopkgtest -U -ddd hello*.dsc -- qemu ../autopkgtest-noble-amd64.img
autopkgtest: DBG: autopkgtest options: Namespace(override_control=None, only_tests=[], skip_tests=None, built_binaries=True, architecture=None, packages=['hello_2.10-3build2.dsc'], output_dir=None)
autopkgtest: DBG: virt-runner arguments: ['qemu', '../autopkgtest-noble-amd64.img']
autopkgtest: DBG: actions: [('source', 'hello_2.10-3build2.dsc', True)]
autopkgtest: DBG: build binaries: True
autopkgtest: DBG: testbed init
autopkgtest [12:09:07]: starting date and time: 2024-10-21 12:09:07-0700
autopkgtest [12:09:07]: version 5.34ubuntu2
autopkgtest [12:09:07]: host clean-noble-amd64; command line: /usr/bin/autopkgtest -U -ddd hello_2.10-3build2.dsc -- qemu ../autopkgtest-noble-amd64.img
(truncated for brevity)
autopkgtest: DBG: sending command to testbed: copydown hello_2.10.orig.tar.gz /tmp/autopkgtest.gHgSJl/hello_2.10.orig.tar.gz
autopkgtest: DBG: got reply from testbed: timeout
autopkgtest: DBG: sending command to testbed: auxverb_debug_fail
autopkgtest: DBG: got reply from testbed: ok
autopkgtest: DBG: TestbedFailure sent `auxverb_debug_fail', got `timeout', expected `ok...'
autopkgtest: DBG: testbed stop
autopkgtest: DBG: testbed close, scratch=/tmp/autopkgtest.gHgSJl
autopkgtest: DBG: sending command to testbed: close
qemu-system-x86_64: terminating on signal 15 from pid 7523 (/usr/bin/python3)
autopkgtest: DBG: got reply from testbed: ok
autopkgtest: DBG: sending command to testbed: quit
autopkgtest [12:14:36]: ERROR: testbed failure: sent `auxverb_debug_fail', got `timeout', expected `ok...'

The testbed hung until timeout during copydown, so the bug is replicated.
Now we verify the fix.

$ apt policy autopkgtest
autopkgtest:
  Installed: 5.38ubuntu1~24.04.1
  Candidate: 5.38ubuntu1~24.04.1
  Version table:
 *** 5.38ubuntu1~24.04.1 100
        100 http://archive.ubuntu.com/ubuntu noble-proposed/main amd64 Packages
        100 /var/lib/dpkg/status
     5.34ubuntu2 500
        500 http://archive.ubuntu.com/ubuntu noble/main amd64 Packages

We create a fresh Noble image and rerun the reproducer:

$ autopkgtest -U hello*.dsc -- qemu ../autopkgtest-noble-amd64.img
autopkgtest [12:35:03]: starting date and time: 2024-10-21 12:35:03-0700
autopkgtest [12:35:03]: version 5.38ubuntu1~24.04.1
autopkgtest [12:35:03]: host clean-noble-amd64; command line: /usr/bin/autopkgtest -U hello_2.10-3build2.dsc -- qemu ../autopkgtest-noble-amd64.img
autopkgtest [12:35:19]: testbed dpkg architecture: amd64
autopkgtest [12:35:20]: testbed apt version: 2.7.14build2
autopkgtest [12:35:20]: @@@@@@@@@@@@@@@@@@@@ test bed setup
(truncated for brevity)
autopkgtest [12:37:04]: test upstream-tests: [-----------------------
Testing greeting-1 ...
Testing hello-1 ...
Testing last-1 ...
Testing traditional-1 ...
autopkgtest [12:37:05]: test upstream-tests: -----------------------]
autopkgtest [12:37:06]: test upstream-tests:  - - - - - - - - - - results - - - - - - - - - -
upstream-tests       PASS
autopkgtest [12:37:06]: @@@@@@@@@@@@@@@@@@@@ summary
command1             PASS
upstream-tests       PASS

So the test was completed without a timeout. 
This concludes the verification for Noble.

-- 
You received this bug notification because you are a member of
Canonical's Ubuntu QA, which is subscribed to autopkgtest in Ubuntu.
Matching subscriptions: ubuntu-qa-bugs
https://bugs.launchpad.net/bugs/2056461

Title:
  autopkgtest-virt-qemu on noble images sometimes hangs doing copydown

Status in Linux:
  Confirmed
Status in autopkgtest package in Ubuntu:
  Fix Released
Status in linux package in Ubuntu:
  In Progress
Status in autopkgtest source package in Jammy:
  In Progress
Status in linux source package in Jammy:
  New
Status in autopkgtest source package in Noble:
  In Progress
Status in linux source package in Noble:
  New
Status in autopkgtest package in Debian:
  Fix Released

Bug description:
  [Impact]

  It seems that kernel 6.8 introduced a regression in the 9pfs related
  to caching and netfslib, that can cause some user-space apps to read
  content from files that is not up-to-date (when they are used in a
  producer/consumer fashion).

  It seems that the offending commit is this one:

   80105ed2fd27 ("9p: Use netfslib read/write_iter")

  Reverting the commit seems to fix the problem. However the actual bug
  might be in netfslib or how netfslib is used in the 9p context.

  The regression has been reported upstream and we are still
  investigating (https://lore.kernel.org/lkml/Zj0ErxVBE3DYT2Ea@gpd/).

  In the meantime it probably makes sense to temporarily revert the
  commit as a SAUCE patch. Then we will drop the SAUCE patch once we'll
  have a proper fix upstream.

  [Test case]

  The following test should complete correctly without any timeout:

    pull-lp-source -d hello
    autopkgtest-buildvm-ubuntu-cloud -r noble
    autopkgtest -U hello*.dsc -- qemu ./autopkgtest-noble-amd64.img

  
  [Fix]

  Revert the following commit (until we have a proper fix upstream):

   80105ed2fd27 ("9p: Use netfslib read/write_iter")

  [Regression potential]

  We may experience other regressions related to 9pfs with this change,
  however it's quite unlikely to happen since we are reverting a commit,
  restoring the previous behavior.

  [Original bug report]

  autopkgtest-virt-qemu sometimes hangs when running tests on noble
  images. Originally reported by schopin, who also provided a
  reproducer:

    pull-lp-source -d hello
    autopkgtest-buildvm-ubuntu-cloud -r noble
    autopkgtest -U hello*.dsc -- qemu ./autopkgtest-noble-amd64.img

  I've been able to reproduce it with debugging enabled:

    autopkgtest -ddd -U hello_2.10-3.dsc -- qemu --debug --show-boot
  /path/to/image

  It can get stuck during different stages, but AFAICT always during
  "copydown" operations, log excerpts follow. It may be a coincidence,
  but this started happening around the time linux-
  image-6.8.0-11-generic (6.8.0-11.11) migrated to noble. The testbeds I
  used booted 6.6 but then rebooted into that 6.8 kernel after being
  upgraded by autopkgtest.

  -- logs --

  Removing autopkgtest-satdep (0) ...
  [...]
  autopkgtest-virt-qemu: DBG: executing copydown /tmp/autopkgtest.output.g8v75e8g/tests-tree/ /t/
  autopkgtest-virt-qemu: DBG: ['cmdls', "(['tar', '--directory', '/tmp/autopkgtest.output.g8v75e]
  autopkgtest-virt-qemu: DBG: ['srcstdin', "<_io.BufferedReader name='/dev/null'>", 'deststdout']
  autopkgtest-virt-qemu: DBG:  +< tar --directory /tmp/autopkgtest.output.g8v75e8g/tests-tree/ --
  autopkgtest-virt-qemu: DBG:  +> /tmp/autopkgtest-qemu.ztmr6f5k/runcmd sh -ec if ! test -d /tmp-
  autopkgtest-virt-qemu: DBG:  +>?

  -- or --

  autopkgtest: DBG: sending command to testbed: copydown /tmp/autopkgtest.output.c9utq3bx/tests-tree/ /tmp/autopkgtest.H8NDfW/build.DLR/src/
  autopkgtest-virt-qemu: DBG: executing copydown /tmp/autopkgtest.output.c9utq3bx/tests-tree/ /tmp/autopkgtest.H8NDfW/build.DLR/src/
  autopkgtest-virt-qemu: DBG: ['cmdls', "(['tar', '--directory', '/tmp/autopkgtest.output.c9utq3bx/tests-tree/', '--warning=none', '-c', '.', '-f', '-'], ['/tmp/autopkgtest-qemu.qtkcgg5l/runcm]
  autopkgtest-virt-qemu: DBG: ['srcstdin', "<_io.BufferedReader name='/dev/null'>", 'deststdout', "<_io.BufferedReader name='/dev/null'>", 'devnull_read', <_io.BufferedReader name='/dev/null'>]
  autopkgtest-virt-qemu: DBG:  +< tar --directory /tmp/autopkgtest.output.c9utq3bx/tests-tree/ --warning=none -c . -f -
  autopkgtest-virt-qemu: DBG:  +> /tmp/autopkgtest-qemu.qtkcgg5l/runcmd sh -ec if ! test -d /tmp/autopkgtest.H8NDfW/build.DLR/src/; then mkdir -- /tmp/autopkgtest.H8NDfW/build.DLR/src/; fi; cd-
  autopkgtest-virt-qemu: DBG:  +>?

To manage notifications about this bug go to:
https://bugs.launchpad.net/linux/+bug/2056461/+subscriptions



References