desktop-packages team mailing list archive
Message #153309
[Bug 1527318] [NEW] pdftotxt extraction of accented characters
Public bug reported:
When extracting text from a PDF file written in Spanish with pdftotext,
accented characters (ü, á) are extracted incorrectly: the letter is lost
and only a standalone accent mark (´ or ¨) remains.
Example:
Original text => Extracted text
Facultad de Matemática y Computación => Facultad de Matem´tica y Computaci´n
Analizadores Multilingües en FreeLing => Analizadores Multiling¨es en FreeLing
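The pattern in the example above (a standalone accent mark where an accented letter should be) can be detected mechanically in extracted text. The following is only a rough sketch, not part of poppler: it assumes the stray marks are U+00B4 (ACUTE ACCENT) and U+00A8 (DIAERESIS), and the helper name find_suspect_words is hypothetical. It flags affected words but cannot restore them, since the base letter is missing from the output.

```python
import re

# Assumption: the stray marks in the extracted text are the spacing
# (non-combining) characters U+00B4 ACUTE ACCENT and U+00A8 DIAERESIS.
STANDALONE_DIACRITICS = re.compile("[\u00b4\u00a8]")


def find_suspect_words(text):
    """Return the whitespace-separated words that contain a standalone diacritic."""
    return [word for word in text.split() if STANDALONE_DIACRITICS.search(word)]


# The two garbled lines from the example in this report.
extracted = (
    "Facultad de Matem\u00b4tica y Computaci\u00b4n\n"
    "Analizadores Multiling\u00a8es en FreeLing"
)

print(find_suspect_words(extracted))
```

Correctly extracted text such as "Facultad de Matemática" yields an empty list, because precomposed letters like á do not match the pattern.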
** Affects: poppler (Ubuntu)
Importance: Undecided
Status: New
** Attachment added: "FreeLing 3.pdf"
https://bugs.launchpad.net/bugs/1527318/+attachment/4536318/+files/FreeLing%203.pdf
** Description changed:
When extracting text from a PDF file written in Spanish with pdftotext,
accented characters (ü, á) are extracted incorrectly: the letter is lost
and only a standalone accent mark (´ or ¨) remains.
Example:
- Original text Extracted text
- Facultad de Matemática y Computación Facultad de Matem´tica y Computaci´n
- Analizadores Multilingües en FreeLing Analizadores Multiling¨es en FreeLing
+ Original text => Extracted text
+ Facultad de Matemática y Computación => Facultad de Matem´tica y Computaci´n
+ Analizadores Multilingües en FreeLing => Analizadores Multiling¨es en FreeLing
--
https://bugs.launchpad.net/bugs/1527318