
desktop-packages team mailing list archive

[Bug 177929] Re: Should autodetect filename character encoding in zip files

 

Obviously, the purge "fix" above is not a fix but a "break" :). The problem actually seems to be in the p7zip package: the 7z program does not convert the character encoding of file names. It has an "scs" switch for charset conversion, but the switch is not implemented in versions up to p7zip_9.20.1.
The Cyrillic names in Windows-created ZIP files are stored in cp866. Surprisingly, it is quite difficult to add a cp866 locale to Ubuntu. The typical solution:
$ sudo /usr/share/locales/install-language-pack ru_RU.CP866
$ sudo locale-gen
fails silently on the first command. Other solutions fail too (cannot open locale definition file ru_RU.CP866: No such file or directory). The following command does not list cp866 either:
cat /usr/share/i18n/SUPPORTED | grep ^ru
The easiest way to deal with cp866 is to set LC_CTYPE to C (which means no conversion) and convert the output with iconv.
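A minimal sketch of that approach, assuming iconv and 7z are on the PATH. The function names dos_to_utf8 and list_zip_utf8 are made up for illustration and are not part of p7zip or the attached patches. Note that LC_ALL, if set, overrides LC_CTYPE.

```shell
#!/bin/sh
# Hypothetical helper: recode stdin from a DOS code page to UTF-8.
# The code page defaults to CP866, the Cyrillic case discussed here.
dos_to_utf8() {
    iconv -f "${1:-CP866}" -t UTF-8
}

# Hypothetical helper: list an archive with LC_CTYPE=C so that 7z
# passes the stored name bytes through unconverted, then recode them.
# $1 is the archive, $2 an optional DOS code page.
# If LC_ALL is set in the environment, it overrides LC_CTYPE.
list_zip_utf8() {
    LC_CTYPE=C 7z l "$1" | dos_to_utf8 "${2:-CP866}"
}

# Demo that does not need 7z: these six bytes are "Привет" in cp866.
printf '\217\340\250\242\245\342\n' | dos_to_utf8
```

For an archive created with a different DOS code page, pass it as the second argument, e.g. list_zip_utf8 archive.zip CP852 (archive.zip is a placeholder name).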
--
The workaround is attached as a patch. Apply it to /usr/bin/7z.
7z-natspec.patch: autodetects your DOS charset. Requires libnatspec to be installed (see the next attachment).
7z.patch: manually set windows_codepage (2nd line) to your DOS charset.
--
The solution described is applicable to any region-specific charset.

** Patch added: "Autodetects our DOS charset. Requires libnatspec to be installed."
   https://bugs.launchpad.net/ubuntu/+source/file-roller/+bug/177929/+attachment/2590958/+files/7z-natspec.patch

-- 
You received this bug notification because you are a member of Desktop
Packages, which is subscribed to file-roller in Ubuntu.
https://bugs.launchpad.net/bugs/177929

Title:
  Should autodetect filename character encoding in zip files

Status in File Roller:
  Confirmed
Status in “file-roller” package in Ubuntu:
  Triaged
Status in “xarchiver” package in Ubuntu:
  Fix Released

Bug description:
  Binary package hint: file-roller

  Many archive formats, such as zip, don't specify the filename
  encoding, but file-roller reads every filename as UTF-8, which
  sometimes causes encoding problems.

  You may think this is not a big problem, not a bug, or not a feature
  we need. If you deal only with English, it is no problem at all. But
  it is a big problem for East Asian users such as Chinese, Japanese and
  Korean speakers, because zip is still a popular format: many people
  create zip files on Windows and share them, and Windows doesn't use
  UTF-8.

  Many CJK users want to extract these zip files. There are some ways
  to solve this problem (using other software), but they are all too
  complex for Ubuntu newbies. I wish file-roller supported selecting
  the encoding.

To manage notifications about this bug go to:
https://bugs.launchpad.net/file-roller/+bug/177929/+subscriptions