cuneiform team mailing list archive
Message #00244
Re: How better to work with unicode?
If possible, I would prefer more common tools (for multilanguage
documents and so on).
Changing the output charset is a simple and fast solution. If Unicode turns
out to be difficult to use, I will follow your good advice.
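
For reference, the four operations listed in the quoted message (read UTF-8 output, compare, sort, search) can be sketched with Python's standard library alone. This is only a minimal sketch under assumptions: the function names are hypothetical, the input is assumed to be valid UTF-8, and sorting is by casefolded code points rather than by locale-aware collation.

```python
import unicodedata


def normalize(text):
    """Return NFC-normalized text so equivalent sequences compare equal."""
    return unicodedata.normalize("NFC", text)


def read_lines(raw):
    """Decode UTF-8 bytes (assumed OCR output) into normalized, non-empty lines."""
    text = raw.decode("utf-8")
    return [normalize(line) for line in text.splitlines() if line.strip()]


def same_text(a, b):
    """Compare two strings case-insensitively after normalization."""
    return normalize(a).casefold() == normalize(b).casefold()


def sort_lines(lines):
    """Sort by casefolded code points (an assumption; not locale collation)."""
    return sorted(lines, key=lambda s: normalize(s).casefold())


def find_all(haystack, needle):
    """Return start indices of every (case-sensitive) occurrence of needle."""
    h, n = normalize(haystack), normalize(needle)
    if not n:
        return []
    hits = []
    start = h.find(n)
    while start != -1:
        hits.append(start)
        start = h.find(n, start + 1)
    return hits


if __name__ == "__main__":
    sample = "Привет\nмир\nАбзац\n".encode("utf-8")
    lines = read_lines(sample)
    print(sort_lines(lines))
    print(same_text("ПРИВЕТ", "привет"))
    print(find_all("абракадабра", "бра"))
```

Normalizing before every comparison matters because the same visible character can arrive as one code point or as a base letter plus a combining mark.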
2009/2/17 Jussi Pakkanen <jpakkane@xxxxxxxxx>
> On Tue, Feb 17, 2009 at 9:46 AM, Dmitry Polevoy
> <openocr.polevoy@xxxxxxxxx> wrote:
>
> > At present I don't want to enhance Cuneiform for full Unicode detection
> > support, as this task is too complicated.
> >
> > I need:
> > - read openocr output results (UTF-8 AFAIK)
> > - compare strings
> > - sort strings
> > - search in strings
>
> If you want to compare performance against old versions, I think the
> easiest path would be to change the output charset. The command line
> client always sets the output encoding to UTF-8. Maybe we could add a
> command line switch --native-charset or something similar that outputs
> the original characters.
>
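
Until a switch like the proposed --native-charset exists, the UTF-8 output could be re-encoded after the fact. The sketch below is only an illustration: the function name is hypothetical, and cp1251 is an assumed target charset chosen for Russian text, not a statement about what Cuneiform uses internally.

```python
def to_native_charset(utf8_bytes, charset="cp1251", errors="replace"):
    """Re-encode UTF-8 bytes into a legacy single-byte charset.

    Characters that the target charset cannot represent are replaced
    with '?' when errors="replace" (the default here).
    """
    return utf8_bytes.decode("utf-8").encode(charset, errors=errors)


if __name__ == "__main__":
    original = "Привет".encode("utf-8")
    converted = to_native_charset(original)
    # Decoding with the same charset recovers the original text.
    print(converted.decode("cp1251"))
```

Note that this conversion is lossy for any character outside the target charset, which is one reason to keep UTF-8 for multilanguage documents.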