cuneiform team mailing list archive

Re: How better to work with unicode?

To: cuneiform@xxxxxxxxxxxxxxxxxxx
From: Dmitry Polevoy <openocr.polevoy@xxxxxxxxx>
Date: Tue, 17 Feb 2009 18:35:58 +0300
In-reply-to: <42d23b2e0902170402h62835e97ye97aef277ed49c73@mail.gmail.com>

If possible, I would prefer more general-purpose tools (for multilanguage
documents and so on).
Switching the output charset is a simple and fast solution. If Unicode
turns out to be difficult to use, I will follow your good advice.
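
For reference, the four operations listed in the quoted message below (read
UTF-8 output, compare, sort, search) can be sketched with general-purpose
tools. This is only a rough illustration in Python, not Cuneiform code: the
function names are made up for this example, and it assumes the OCR output
really is valid UTF-8.

```python
import unicodedata


def load_lines(raw):
    """Decode UTF-8 bytes and normalize each line to NFC.

    Assumes the input is valid UTF-8; invalid bytes raise UnicodeDecodeError.
    """
    text = raw.decode("utf-8")
    return [unicodedata.normalize("NFC", line) for line in text.splitlines()]


def same_text(a, b):
    """Compare two strings after NFC normalization, so that a precomposed
    character and its base-plus-combining-mark form count as equal."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


def sort_lines(lines):
    """Sort case-insensitively by code point order.

    This is not locale-aware collation; it is only a simple approximation.
    """
    return sorted(lines, key=str.casefold)


def find_all(haystack, needle):
    """Return the character (not byte) offsets of every match of needle."""
    positions = []
    start = haystack.find(needle)
    while start != -1:
        positions.append(start)
        start = haystack.find(needle, start + 1)
    return positions
```

Working on decoded strings rather than raw bytes means that offsets and
comparisons are per character, which matters for multilanguage documents.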

2009/2/17 Jussi Pakkanen <jpakkane@xxxxxxxxx>

> On Tue, Feb 17, 2009 at 9:46 AM, Dmitry Polevoy
> <openocr.polevoy@xxxxxxxxx> wrote:
>
> > At present I don't want to extend Cuneiform with full Unicode detection
> > support, as this task is too complicated.
> >
> > I need:
> > - read openocr output results (UTF-8, AFAIK)
> > - compare strings
> > - sort strings
> > - search in strings
>
> If you want to compare performance against old versions, I think the
> easiest path would be to change the output charset. The command line
> client always sets the output encoding to UTF-8. Maybe we could add a
> command line switch --native-charset or something similar that outputs
> the original characters.
>
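Until such a switch exists, a similar effect might be obtained by
converting the UTF-8 output after the fact. The sketch below is only an
illustration under assumptions: the helper name is hypothetical, and cp1251
is just an example of a single-byte charset for Russian text, not
necessarily the charset Cuneiform uses internally.

```python
def to_native(utf8_bytes, charset="cp1251"):
    """Re-encode UTF-8 bytes into a single-byte charset.

    charset="cp1251" is an assumed example. Characters the target charset
    cannot represent are replaced with "?".
    """
    text = utf8_bytes.decode("utf-8")
    return text.encode(charset, errors="replace")
```

Note that this conversion is lossy for any character outside the target
charset, so it only suits comparisons against output that was itself
produced in that charset.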
