
desktop-packages team mailing list archive

[Bug 1527318] [NEW] pdftotxt extraction of accented characters

 

Public bug reported:

When text is extracted from a Spanish-language PDF file with pdftotext,
accented characters (ü, á) come out wrong: the accent appears as a
separate mark and the base letter is lost.

Example:
Original text => Extracted text
Facultad de Matemática y Computación => Facultad de Matem´tica y Computaci´n
Analizadores Multilingües en FreeLing => Analizadores Multiling¨es en FreeLing
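
One way to spot affected lines in extracted text is to search for accent
marks that stand on their own. The sketch below is only an illustration,
not part of poppler: it assumes the stray marks are the spacing characters
U+00B4 (acute accent) and U+00A8 (diaeresis), as they appear in the example
above, and the helper name find_stray_accents is hypothetical.

```python
import re

# Assumption: the garbled output contains the spacing (non-combining)
# characters U+00B4 ACUTE ACCENT and U+00A8 DIAERESIS, as in the example.
STRAY_ACCENTS = re.compile("[\u00b4\u00a8]")


def find_stray_accents(text):
    """Return (line_number, line) pairs for lines containing a stray accent."""
    return [
        (number, line)
        for number, line in enumerate(text.splitlines(), start=1)
        if STRAY_ACCENTS.search(line)
    ]


if __name__ == "__main__":
    # First line mimics the garbled output; second line is unaffected text.
    sample = "Facultad de Matem\u00b4tica y Computaci\u00b4n\nen FreeLing\n"
    for number, line in find_stray_accents(sample):
        print(number, line)
```

Correctly extracted text uses precomposed letters such as á (U+00E1) and
ü (U+00FC), which this pattern does not match, so only the damaged lines
are reported.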

** Affects: poppler (Ubuntu)
     Importance: Undecided
         Status: New

** Attachment added: "FreeLing 3.pdf"
   https://bugs.launchpad.net/bugs/1527318/+attachment/4536318/+files/FreeLing%203.pdf

** Description changed:

  When text is extracted from a Spanish-language PDF file with pdftotext,
  accented characters (ü, á) come out wrong: the accent appears as a
  separate mark and the base letter is lost.
  
  Example:
- Original text                                                                            Extracted text
- Facultad de Matemática y Computación                            Facultad de Matem´tica y Computaci´n
- Analizadores Multilingües en FreeLing                              Analizadores Multiling¨es en FreeLing
+ Original text => Extracted text
+ Facultad de Matemática y Computación => Facultad de Matem´tica y Computaci´n
+ Analizadores Multilingües en FreeLing => Analizadores Multiling¨es en FreeLing

-- 
You received this bug notification because you are a member of Desktop
Packages, which is subscribed to poppler in Ubuntu.
https://bugs.launchpad.net/bugs/1527318

Title:
  pdftotxt extraction of accented characters

Status in poppler package in Ubuntu:
  New

Bug description:
  When text is extracted from a Spanish-language PDF file with pdftotext,
  accented characters (ü, á) come out wrong: the accent appears as a
  separate mark and the base letter is lost.

  Example:
  Original text => Extracted text
  Facultad de Matemática y Computación => Facultad de Matem´tica y Computaci´n
  Analizadores Multilingües en FreeLing => Analizadores Multiling¨es en FreeLing

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/poppler/+bug/1527318/+subscriptions