← Back to team overview

linuxdcpp-team team mailing list archive

[Bug 1909861] Re: FileReader is not thread safe on Linux

 

It appears that pre-uring AIO is being supplanted by io_uring
(https://unixism.net/2020/04/io-uring-by-example-part-1-introduction/
and https://www.scylladb.com/2020/05/05/how-io_uring-and-ebpf-will-
revolutionize-programming-in-linux/). AIO has had persistent critiques
since it was introduced (e.g., https://lwn.net/Articles/724198/), and as
best as I can tell appears to have been tolerated in the absence of
anything better until Linux 5.1, which added io_uring.

Asynchronous Linux file I/O is in an unfortunate limbo at the moment,
where the ubiquitously supported approach is being displaced quickly
enough it's unclear whether it's worth targeting specifically anymore.
The usual approach appears to be to use libuv or libevent to abstract
over these interfaces, both across Linux kernel versions and operating
systems more broadly.

For the BSDs (Open/Free/Net/Dragonfly) and macOS, there's kqueue() with
EVFILT_READ, which has been around long enough it can simply be required
as part of the condition of supporting those OSes. Except for macOS,
these are probably quite niche even within the DC userbase, but since
the kqueue() appears to operate similarly across all of them, a single
approach should work.

Therefore, to the extent that readMapped() was meant partly to introduce
asynchrony even when readDirect() didn't work, the in-principle-better
approach would seem still be to drop readMapped() and implement
readDirect() for non-Windows OSes using io_uring and/or kqueue(),
potentially through existing libraries such as libevent or libuv if
appropriate.

Strictly in terms of addressing this bug that you've reported for DC++,
I probably wouldn't wait for such steps. Given that DC++ does not
officially support any non-Windows platform, it seems fine to use, in
your terminology, readAsync with a readUnbuffered fallback, without any
readMapped().

-- 
You received this bug notification because you are a member of
Dcplusplus-team, which is subscribed to DC++.
https://bugs.launchpad.net/bugs/1909861

Title:
  FileReader is not thread safe on Linux

Status in DC++:
  New

Bug description:
  FileReader::readMapped currently modifies the global SIGBUS handler in
  order to catch read errors:

  https://sourceforge.net/p/dcplusplus/code/ci/default/tree/dcpp/FileReader.cpp#l289

  Since the function can be called concurrently from different threads
  (currently hashing/queue recheck/sfv check in DC++) and each of them
  sets and resets the SIGBUS handler, there's a high risk that the
  application will crash in case of read errors as they aren't being
  handler properly.

  More information about the caveats:
  https://www.sublimetext.com/blog/articles/use-mmap-with-care

  These issues are much more likely to happen with AirDC++ as it uses
  multiple threads for hashing. Read errors caused rather nasty crashes
  with corrupted stack traces for one user, but luckily he was able to
  catch the SIGBUS signal with gdb.

  I didn't even spend time in trying to figure out how to make the
  mapped reads work properly, as based on my testing the basic
  FileReader::readCached function is noticeably faster:

  readMapped: 671 files (21.70 GiB) in 9 directories have been hashed in 4 minutes 21 seconds (84.87 MiB/s)
  readCached: 671 files (21.70 GiB) in 9 directories have been hashed in 3 minutes 58 seconds (93.08 MiB/s)

  FileReader::readMapped is now disabled in AirDC++, as I can't see any
  benefits from using it. The included setjmp.h header is even causing
  issues when using clang for compiling on Linux:
  https://bugs.gentoo.org/731676

To manage notifications about this bug go to:
https://bugs.launchpad.net/dcplusplus/+bug/1909861/+subscriptions


References