
linuxdcpp-team team mailing list archive

[Bug 1649066] Re: Invalid UTF-8 data is not always being rejected

 

Good point about favorites. Maybe the download queue and hash files would
be of concern as well?

Could we handle them through some upgrade step? It could be done outside of the XML parsing code and launched for these "precious" files when the XML parser encounters a validation error.
It would make a best effort to replace invalid bytes with question marks or ASCII equivalents, and show an error message when all else fails. It is important that no user loses settings (especially favorites) when upgrading.
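
A rough sketch of what the "replace by question marks" fallback could look like. This is not existing DC++ code: the names wellFormedLength and sanitizeUtf8 are made up for illustration, and it assumes the file contents are already loaded into a std::string. Each byte that does not start a well-formed UTF-8 sequence becomes a single '?', and the optional flag tells the caller whether anything was changed, so a warning can be shown.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: length (1-4) of the well-formed UTF-8 sequence that
// starts at s[i], or 0 if the bytes there are not well-formed.
static std::size_t wellFormedLength(const std::string& s, std::size_t i) {
    const unsigned char b0 = static_cast<unsigned char>(s[i]);
    if (b0 < 0x80) return 1;  // plain ASCII

    std::size_t len = 0;
    unsigned int cp = 0;
    if (b0 >= 0xC2 && b0 <= 0xDF)      { len = 2; cp = b0 & 0x1F; }
    else if (b0 >= 0xE0 && b0 <= 0xEF) { len = 3; cp = b0 & 0x0F; }
    else if (b0 >= 0xF0 && b0 <= 0xF4) { len = 4; cp = b0 & 0x07; }
    else return 0;  // stray continuation byte, 0xC0/0xC1, or 0xF5..0xFF

    if (i + len > s.size()) return 0;  // sequence cut off at end of input

    for (std::size_t k = 1; k < len; ++k) {
        const unsigned char b = static_cast<unsigned char>(s[i + k]);
        if ((b & 0xC0) != 0x80) return 0;  // not a continuation byte
        cp = (cp << 6) | (b & 0x3F);
    }

    if (len == 3 && cp < 0x800) return 0;        // overlong 3-byte form
    if (len == 4 && cp < 0x10000) return 0;      // overlong 4-byte form
    if (cp >= 0xD800 && cp <= 0xDFFF) return 0;  // UTF-16 surrogate
    if (cp > 0x10FFFF) return 0;                 // beyond Unicode range
    return len;
}

// Hypothetical best-effort repair: valid sequences are copied unchanged,
// every offending byte is replaced by '?'. If 'changed' is non-null it is
// set to true when at least one byte had to be replaced.
std::string sanitizeUtf8(const std::string& in, bool* changed = nullptr) {
    std::string out;
    out.reserve(in.size());
    bool modified = false;

    for (std::size_t i = 0; i < in.size();) {
        const std::size_t n = wellFormedLength(in, i);
        if (n == 0) {
            out += '?';
            modified = true;
            ++i;  // skip only the bad byte, then resynchronize
        } else {
            out.append(in, i, n);
            i += n;
        }
    }

    if (changed) *changed = modified;
    return out;
}
```

An upgrade step could run something like this over the raw file only after the XML parser has rejected it, keep a backup of the original, and show the error message if the repaired file still fails to parse.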

-- 
You received this bug notification because you are a member of
Dcplusplus-team, which is subscribed to DC++.
https://bugs.launchpad.net/bugs/1649066

Title:
  Invalid UTF-8 data is not always being rejected

Status in AirDC++:
  New
Status in DC++:
  New

Bug description:
  There are various cases where invalid UTF-8 data is being consumed by
  the core:

  1. Text::convert will return the original string in case of errors (Linux only; the respective Windows-specific functions return an empty string in case of errors)
  2. When using "utf-8" encoding in NMDC hubs, the original string will always be returned by conversion functions without validation (generally Linux only since "utf-8" can't be selected from DC++'s GUI)
  3. UTF-8 validation is not performed for strings parsed from XML (specifically file/directory names in filelists)
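
  For illustration, the kind of check that all three paths above would need before accepting a string might look like the sketch below. It is not the DC++ implementation; the function name isValidUtf8 is an assumption made for this example. It follows the well-formedness rules of RFC 3629: no overlong encodings, no UTF-16 surrogates, nothing above U+10FFFF.

```cpp
#include <cstddef>
#include <string>

// Hypothetical strict validator: returns true only if every byte of 's'
// belongs to a well-formed UTF-8 sequence.
bool isValidUtf8(const std::string& s) {
    std::size_t i = 0;
    while (i < s.size()) {
        const unsigned char b0 = static_cast<unsigned char>(s[i]);
        if (b0 < 0x80) {  // plain ASCII
            ++i;
            continue;
        }

        std::size_t len = 0;
        unsigned int cp = 0;
        if (b0 >= 0xC2 && b0 <= 0xDF)      { len = 2; cp = b0 & 0x1F; }
        else if (b0 >= 0xE0 && b0 <= 0xEF) { len = 3; cp = b0 & 0x0F; }
        else if (b0 >= 0xF0 && b0 <= 0xF4) { len = 4; cp = b0 & 0x07; }
        else return false;  // stray continuation, 0xC0/0xC1, 0xF5..0xFF

        if (i + len > s.size()) return false;  // truncated sequence

        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char b = static_cast<unsigned char>(s[i + k]);
            if ((b & 0xC0) != 0x80) return false;  // not a continuation
            cp = (cp << 6) | (b & 0x3F);
        }

        if (len == 3 && cp < 0x800) return false;        // overlong
        if (len == 4 && cp < 0x10000) return false;      // overlong
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;  // surrogate
        if (cp > 0x10FFFF) return false;                 // out of range

        i += len;
    }
    return true;
}
```

  A conversion function that fails such a check could then return an empty string, as the Windows code path is described as doing, rather than handing the original bytes back to the caller.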

  This will cause issues especially when the data is processed by
  external sources/libraries that expect to receive valid UTF-8 data
  (https://github.com/airdcpp-web/airdcpp-webclient/issues/204). I'm not
  sure about security implications.

  Another note: messages that fail UTF-8 validation in ADC hubs are
  ignored silently. At least Flexhub seems to be having problems with
  data validation which currently goes unnoticed.


