cuneiform team mailing list archive
Message #00318
[Bug 388926] Re: Lithuanian text recognition: wrong recognition of "ų" as an "ę"
These files are actually data files. It is possible that no source
code exists for them, that is, that they are in their original form.
Whether or not this is true, we simply don't know. As Jussi said, not
much is known about how the code actually works.
--
Lithuanian text recognition: wrong recognition of "ų" as an "ę"
https://bugs.launchpad.net/bugs/388926
You received this bug notification because you are a member of Cuneiform
Linux, which is the registrant for Cuneiform for Linux.
Status in Linux port of Cuneiform: New
Bug description:
Using cuneiform 0.7 on Ubuntu 9.04
When OCR-ing a Lithuanian text with the switch "-l lit", a large number of the letters "ų" that usually come at the end of a word get recognized as "ę".
If someone pointed me to the source file I have to check, I am pretty certain that the solution is simple, as the mistake is very simple. However, I cannot find the file: the closest matches, datafiles/*lit.dat, are binary and I cannot edit those...