linuxdcpp-team team mailing list archive

Thread
Date
[Bug 1909861] Re: FileReader is not thread safe on Linux

To: linuxdcpp-team@xxxxxxxxxxxxxxxxxxx
From: cologic <1909861@xxxxxxxxxxxxxxxxxx>
Date: Mon, 04 Jan 2021 07:35:02 -0000
Reply-to: Bug 1909861 <1909861@xxxxxxxxxxxxxxxxxx>
Sender: bounces@xxxxxxxxxxxxx
Apropos "So possibly the FileReader methods should rather be called
readAsync (overlapped/AIO) and readUnbuffered
(FILE_FLAG_NO_BUFFERING/O_DIRECT/F_NOCACHE). Is buffered reading even
needed there? When would that work if the unbuffered read fails?":

An unbuffered read might depend on buffer alignment, e.g., in NetBSD:
"To meet the alignment requirements for direct I/O, file offset, the length of the I/O and the address of the buffer in memory must all be multiples of DEV_BSIZE (512 bytes)." A buffered read doesn't have that requirement, so might succeed where an unbuffered read fails. There'd still be some value in a buffered fallback, then.

A file reading framework for hashing or uploading might ideally:
- be asynchronous, e.g., OVERLAPPED on Windows, io_uring on Linux, and kqueue() on BSDs/macOS;
- and avoid pointlessly filling disk cache with file contents that will likely not be read again within a relevant timeframe, while still allowing the OS to pre-fetch/readahead cache for blocking calls:
  (1) FILE_FLAG_SEQUENTIAL_SCAN for non-OVERLAPPED, because readahead caching's valuable for blocking operations, and FILE_FLAG_NO_BUFFERING for OVERLAPPED on Windows, per https://docs.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-createfilew, which states "When FILE_FLAG_NO_BUFFERING is combined with FILE_FLAG_OVERLAPPED, the flags give maximum asynchronous performance, because the I/O does not rely on the synchronous operations of the memory manager. However, some I/O operations take more time, because data is not being held in the cache."
  (2a) POSIX_FADV_SEQUENTIAL potentially combined with POSIX_FADV_DONTNEED for already-read files/sections of file and O_DIRECT/F_NOCACHE when using io_uring/aio/kqueue for similar reasons as FILE_FLAG_NO_BUFFERING and FILE_FLAG_OVERLAPPED work together in Windows; or
  (2b) with a different speed/memory usage tradeoff than (2a), using FILE_FLAG_NO_BUFFERING/O_DIRECT/F_NOCACHE regardless of whether the reads are asynchronous. This is slightly slower in your benchmarks, likely being hurt by disabling readahead/pre-fetch.  Still, it's possible that FILE_FLAG_SEQUENTIAL_SCAN clobbers the OS disk caches when one pushes 20GB of hashed files through it.

If it's fast enough, and explicitly in async cases,
FILE_FLAG_NO_BUFFERING/O_DIRECT/F_NOCACHE might probably on balance be
better, but does create buffer alignment requirements on various
platforms, which might require more testing, and/or a buffered fallback.
As you suggest, I'd be curious too about how well
FILE_FLAG_NO_BUFFERING/O_DIRECT/F_NOCACHE works with high-latency
devices, such as HDDs and slower networks; fast networks with SSDs on
the other end should have lower end-to-end latency than typical HDDs
(~10ms).

https://github.com/facebook/rocksdb/issues/1032#issuecomment-196874121
suggests that rocksdb does (2a), going through the file sequentially and
then explicitly dropping cache behind itself, or did in 2016 at least. I
couldn't verify from their current code that they still do that reading,
though they definitely do it when writing files they're not going to
read back: they write the file, then call with POSIX_FADV_DONTNEED
afterwards.

Asynchronous file I/O is useful here partly because it allows for
avoiding file caching with fewer other tradeoffs, especially in AirDC++,
where one might have multiple hashing threads.

On Windows, DC++, at least, has long used:
size_t FileReader::readDirect(const string& file, const DataCallback& callback) {
	...

	auto tmp = ::CreateFile(tfile.c_str(), GENERIC_READ, FILE_SHARE_READ, nullptr, OPEN_EXISTING,
		FILE_FLAG_NO_BUFFERING | FILE_FLAG_OVERLAPPED | FILE_FLAG_POSIX_SEMANTICS, nullptr);

Capturing the asynchronous case already. As far as I understand, neither
DC++ nor AirDC++ read files asynchronously on any non-Windows platform,
to capture the better case (1). This is likely out of scope of near-term
DC++ development, but since AirDC++ does support non-Windows platforms,
might prove worthwhile.

So, it looks reasonable to have DC++ and AirDC++ on each supported
platform prefer (1), and use one or both of (2a) and (2b), depending on
the speed/fragility hit from (2a). If it's possible to align and size
buffers reliably to match (2a) requirements, then that's probably more
elegant, and if it also doesn't cause too much trouble with higher-
latency I/O devices, it's probably overall more elegant.

** Bug watch added: github.com/facebook/rocksdb/issues #1032
   https://github.com/facebook/rocksdb/issues/1032

-- 
You received this bug notification because you are a member of
Dcplusplus-team, which is subscribed to DC++.
https://bugs.launchpad.net/bugs/1909861

Title:
  FileReader is not thread safe on Linux

Status in DC++:
  New

Bug description:
  FileReader::readMapped currently modifies the global SIGBUS handler in
  order to catch read errors:

  https://sourceforge.net/p/dcplusplus/code/ci/default/tree/dcpp/FileReader.cpp#l289

  Since the function can be called concurrently from different threads
  (currently hashing/queue recheck/sfv check in DC++) and each of them
  sets and resets the SIGBUS handler, there's a high risk that the
  application will crash in case of read errors as they aren't being
  handler properly.

  More information about the caveats:
  https://www.sublimetext.com/blog/articles/use-mmap-with-care

  These issues are much more likely to happen with AirDC++ as it uses
  multiple threads for hashing. Read errors caused rather nasty crashes
  with corrupted stack traces for one user, but luckily he was able to
  catch the SIGBUS signal with gdb.

  I didn't even spend time in trying to figure out how to make the
  mapped reads work properly, as based on my testing the basic
  FileReader::readCached function is noticeably faster:

  readMapped: 671 files (21.70 GiB) in 9 directories have been hashed in 4 minutes 21 seconds (84.87 MiB/s)
  readCached: 671 files (21.70 GiB) in 9 directories have been hashed in 3 minutes 58 seconds (93.08 MiB/s)

  FileReader::readMapped is now disabled in AirDC++, as I can't see any
  benefits from using it. The included setjmp.h header is even causing
  issues when using clang for compiling on Linux:
  https://bugs.gentoo.org/731676

To manage notifications about this bug go to:
https://bugs.launchpad.net/dcplusplus/+bug/1909861/+subscriptions
References

[Bug 1909861] [NEW] FileReader is not thread safe on Linux
From: maksis, 2021-01-02