canonical-ubuntu-qa team mailing list archive
Message #04527
[Bug 2056461]
BTW while I still have time to type things:
(In reply to TJ from comment #7)
> Considering the (changed) source-code of
> fs/9p/vfs_file.c::v9fs_file_read_iter() in commit
> 80105ed2fd2715fb09a8fdb0655a8bdc86c120db.
>
> Prior to the commit there was a code path specifically for O_NONBLOCK (as
> well as for p9L_DIRECT). Now there is only a code-path for P9L_DIRECT.
>
> kernel.36.log shows that within this function the p9L_DIRECT path is taken
> since there is no debug message from the final debug message:
>
> if (fid->mode & P9L_DIRECT)
> return netfs_unbuffered_read_iter(iocb, to);
>
> p9_debug(P9_DEBUG_VFS, "(cached)\n");
> return netfs_file_read_iter(iocb, to);
> }
>
> I'm not familiar enough with the netfs layer to understand if O_NONBLOCK is
> being handled (correctly) there - if special handling is indeed required?
That O_NONBLOCK handling was slightly different: with O_NONBLOCK we used to stop the read loop on the first server read result, even if it wasn't 0.
Without O_NONBLOCK we used to loop until EOF (i.e. read returns 0).
For this particular case, the O_NONBLOCK handling doesn't matter at all (eofcat will loop anyway), but for synthetic servers that use files as pipes we might want the O_NONBLOCK handling back.
OTOH, that made some programs like tar very slow because they set O_NONBLOCK on regular files when they don't need to; I kind of regret having accepted 52cbee2a5768 ("9p: read only once on O_NONBLOCK") now; we probably should have added a new way of setting it (e.g. at mount time). But right now there's no way of doing that, so Sergey will likely ask again when they upgrade their kernel...
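To make the difference concrete, here is a rough userspace model of the two behaviours described above. It is only a sketch, not the kernel code: struct fake_server, server_read() and read_loop() are hypothetical names made up for illustration, and the "server" is just a buffer that hands out at most chunk bytes per request.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a 9p server (not a kernel structure). */
struct fake_server {
    const char *data;   /* bytes the server currently has */
    size_t len;         /* number of bytes in data */
    size_t chunk;       /* most bytes returned per read request */
    unsigned requests;  /* read requests issued so far */
};

/* One server round trip: copy up to count bytes from offset; 0 means EOF. */
static size_t server_read(struct fake_server *srv, size_t offset,
                          char *buf, size_t count)
{
    size_t n;

    srv->requests++;
    if (offset >= srv->len)
        return 0;
    n = srv->len - offset;
    if (n > count)
        n = count;
    if (n > srv->chunk)
        n = srv->chunk;
    memcpy(buf, srv->data + offset, n);
    return n;
}

/*
 * Model of the old read loop as described in this message:
 * - nonblock != 0: return after the first server reply, even a short one;
 * - nonblock == 0: keep asking until the buffer is full or the server
 *   returns 0 (EOF).
 */
static size_t read_loop(struct fake_server *srv, char *buf, size_t count,
                        int nonblock)
{
    size_t total = 0;

    assert(srv->chunk > 0);
    while (total < count) {
        size_t n = server_read(srv, total, buf + total, count - total);

        if (n == 0)
            break;
        total += n;
        if (nonblock)
            break;
    }
    return total;
}
```

With an 11-byte file and a 4-byte chunk size, the nonblock variant returns 4 bytes after a single request, while the other variant issues four requests (4 + 4 + 3 bytes, then the 0 that signals EOF). That extra round trip per read is fine for regular files, but it is exactly what a synthetic server using a file as a pipe would not want.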
anyway, ETIMEDOUT.
--
You received this bug notification because you are a member of
Canonical's Ubuntu QA, which is subscribed to autopkgtest in Ubuntu.
https://bugs.launchpad.net/bugs/2056461
Title:
autopkgtest-virt-qemu on noble images sometimes hangs doing copydown
Status in Linux:
Confirmed
Status in autopkgtest package in Ubuntu:
Confirmed
Status in linux package in Ubuntu:
In Progress
Status in autopkgtest package in Debian:
New
Bug description:
[Impact]
It seems that kernel 6.8 introduced a regression in 9pfs, related to
caching and netfslib, that can cause some user-space apps to read
stale content from files (when they are used in a producer/consumer
fashion).
It seems that the offending commit is this one:
80105ed2fd27 ("9p: Use netfslib read/write_iter")
Reverting the commit seems to fix the problem. However the actual bug
might be in netfslib or how netfslib is used in the 9p context.
The regression has been reported upstream and we are still
investigating (https://lore.kernel.org/lkml/Zj0ErxVBE3DYT2Ea@gpd/).
In the meantime it probably makes sense to temporarily revert the
commit as a SAUCE patch. We will drop the SAUCE patch once we have a
proper fix upstream.
[Test case]
The following test should complete correctly without any timeout:
pull-lp-source -d hello
autopkgtest-buildvm-ubuntu-cloud -r noble
autopkgtest -U hello*.dsc -- qemu ./autopkgtest-noble-amd64.img
[Fix]
Revert the following commit (until we have a proper fix upstream):
80105ed2fd27 ("9p: Use netfslib read/write_iter")
[Regression potential]
We may experience other regressions related to 9pfs with this change;
however, that is quite unlikely since we are reverting a commit and
restoring the previous behavior.
[Original bug report]
autopkgtest-virt-qemu sometimes hangs when running tests on noble
images. Originally reported by schopin, who also provided a
reproducer:
pull-lp-source -d hello
autopkgtest-buildvm-ubuntu-cloud -r noble
autopkgtest -U hello*.dsc -- qemu ./autopkgtest-noble-amd64.img
I've been able to reproduce it with debugging enabled:
autopkgtest -ddd -U hello_2.10-3.dsc -- qemu --debug --show-boot /path/to/image
It can get stuck during different stages, but AFAICT always during
"copydown" operations; log excerpts follow. It may be a coincidence,
but this started happening around the time
linux-image-6.8.0-11-generic (6.8.0-11.11) migrated to noble. The
testbeds I used booted 6.6 but then rebooted into that 6.8 kernel
after being upgraded by autopkgtest.
-- logs --
Removing autopkgtest-satdep (0) ...
[...]
autopkgtest-virt-qemu: DBG: executing copydown /tmp/autopkgtest.output.g8v75e8g/tests-tree/ /t/
autopkgtest-virt-qemu: DBG: ['cmdls', "(['tar', '--directory', '/tmp/autopkgtest.output.g8v75e]
autopkgtest-virt-qemu: DBG: ['srcstdin', "<_io.BufferedReader name='/dev/null'>", 'deststdout']
autopkgtest-virt-qemu: DBG: +< tar --directory /tmp/autopkgtest.output.g8v75e8g/tests-tree/ --
autopkgtest-virt-qemu: DBG: +> /tmp/autopkgtest-qemu.ztmr6f5k/runcmd sh -ec if ! test -d /tmp-
autopkgtest-virt-qemu: DBG: +>?
-- or --
autopkgtest: DBG: sending command to testbed: copydown /tmp/autopkgtest.output.c9utq3bx/tests-tree/ /tmp/autopkgtest.H8NDfW/build.DLR/src/
autopkgtest-virt-qemu: DBG: executing copydown /tmp/autopkgtest.output.c9utq3bx/tests-tree/ /tmp/autopkgtest.H8NDfW/build.DLR/src/
autopkgtest-virt-qemu: DBG: ['cmdls', "(['tar', '--directory', '/tmp/autopkgtest.output.c9utq3bx/tests-tree/', '--warning=none', '-c', '.', '-f', '-'], ['/tmp/autopkgtest-qemu.qtkcgg5l/runcm]
autopkgtest-virt-qemu: DBG: ['srcstdin', "<_io.BufferedReader name='/dev/null'>", 'deststdout', "<_io.BufferedReader name='/dev/null'>", 'devnull_read', <_io.BufferedReader name='/dev/null'>]
autopkgtest-virt-qemu: DBG: +< tar --directory /tmp/autopkgtest.output.c9utq3bx/tests-tree/ --warning=none -c . -f -
autopkgtest-virt-qemu: DBG: +> /tmp/autopkgtest-qemu.qtkcgg5l/runcmd sh -ec if ! test -d /tmp/autopkgtest.H8NDfW/build.DLR/src/; then mkdir -- /tmp/autopkgtest.H8NDfW/build.DLR/src/; fi; cd-
autopkgtest-virt-qemu: DBG: +>?
To manage notifications about this bug go to:
https://bugs.launchpad.net/linux/+bug/2056461/+subscriptions