cuneiform team mailing list archive

Re: How better to work with unicode?

 

On Tue, Feb 17, 2009 at 9:46 AM, Dmitry Polevoy
<openocr.polevoy@xxxxxxxxx> wrote:

> At present time I don't want to enhance Cuneiform for full Unicode detection
> support as this task is too complicated.
>
> I need:
> - read openocr output results (UTF-8 AFAIK)
> - compare strings
> - sort strings
> - search in strings

If you want to compare performance against old versions, I think the
easiest path would be to change the output charset. The command line
client always sets the output encoding to UTF-8. Maybe we could add a
command line switch, --native-charset or something similar, that
outputs the characters in their original charset instead.
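
For the four tasks in your list, UTF-8 output can also be handled
outside Cuneiform with a few lines of script. Below is a minimal
sketch in Python, not anything that exists in the code base. The
helper names (load_utf8, same_text, find_lines) and the sample text
are made up for illustration, and the sort is by case-folded code
point, which is not a locale-aware collation.

```python
import unicodedata


def load_utf8(raw):
    """Decode UTF-8 bytes and return a list of NFC-normalized lines."""
    text = raw.decode("utf-8")
    return [unicodedata.normalize("NFC", line) for line in text.splitlines()]


def same_text(a, b):
    """Compare two strings after normalization, so that composed and
    decomposed forms of the same character compare equal."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


def find_lines(lines, needle):
    """Return the indices of all lines that contain the needle."""
    needle = unicodedata.normalize("NFC", needle)
    return [i for i, line in enumerate(lines) if needle in line]


# Hypothetical sample standing in for recognized text saved as UTF-8.
sample = "яблоко\nАрбуз\nвишня\n".encode("utf-8")

lines = load_utf8(sample)
# Case-insensitive sort by code point; good enough for a quick comparison,
# but not a substitute for proper locale collation.
ordered = sorted(lines, key=str.casefold)
```

Normalizing both sides before comparing or searching avoids false
mismatches when the same letter is encoded in two different ways.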


