cuneiform team mailing list archive

Re: How better to work with unicode?

To: cuneiform@xxxxxxxxxxxxxxxxxxx
From: Jussi Pakkanen <jpakkane@xxxxxxxxx>
Date: Mon, 16 Feb 2009 12:15:43 +0200
In-reply-to: <1402a13f0902132356n79d463ffs3aa57ba6b8fd0195@mail.gmail.com>

On Sat, Feb 14, 2009 at 9:56 AM, Dmitry Polevoy
<openocr.polevoy@xxxxxxxxx> wrote:

> Do you know a good (maybe standard) library for dealing with Unicode
> strings and conversions?
> I found ICU (http://icu-project.org/), but I can't tell whether we should
> use it or something else.

What do you need to do? Cuneiform's output already supports UTF-8. For
generic conversion between encodings you can use, for example, iconv.
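
As a rough illustration of the iconv route, here is a minimal sketch that
converts a NUL-terminated string from some source charset to UTF-8. The
helper name to_utf8 and its parameters are made up for this example and are
not part of Cuneiform; only iconv_open, iconv and iconv_close are real
library calls.

```c
#include <assert.h>
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not Cuneiform code): convert the NUL-terminated
 * string src, encoded in src_charset, to UTF-8 in dst.
 * Returns the number of bytes written (excluding the NUL), or -1 on error. */
long to_utf8(const char *src, const char *src_charset,
             char *dst, size_t dst_size)
{
    assert(src != NULL && src_charset != NULL);
    assert(dst != NULL && dst_size > 0);

    iconv_t cd = iconv_open("UTF-8", src_charset);
    if (cd == (iconv_t)-1)
        return -1;                      /* conversion not supported */

    char *in = (char *)src;             /* iconv wants a non-const char ** */
    size_t in_left = strlen(src);
    char *out = dst;
    size_t out_left = dst_size - 1;     /* keep one byte for the NUL */

    size_t rc = iconv(cd, &in, &in_left, &out, &out_left);
    if (rc != (size_t)-1)
        rc = iconv(cd, NULL, NULL, &out, &out_left);  /* flush any state */
    iconv_close(cd);

    if (rc == (size_t)-1)
        return -1;                      /* bad input or dst too small */

    *out = '\0';
    return (long)(out - dst);
}
```

Calling to_utf8("\xCF\xF0\xE8", "CP1251", buf, sizeof buf) would be one way
to turn three CP1251 bytes into their six-byte UTF-8 form. CP1251 is only an
assumed source charset here; the set of charset names accepted depends on the
iconv implementation.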

If you want to extend Cuneiform to full Unicode detection support,
that will be extremely difficult. Currently each letter is stored in
one byte (sometimes signed, sometimes unsigned). The first task would
be to change this to a wide character type such as wchar_t, or
something similar. That is going to take a lot of very invasive work.
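
To show why the mixed signedness matters when widening, here is a small
sketch. It assumes, purely for illustration, that the one-byte letters are
CP1251 codes, and it only maps ASCII and the basic Cyrillic range; the names
codepoint_t and widen_cp1251 are hypothetical and do not exist in Cuneiform.

```c
#include <stdint.h>

/* Hypothetical wide type for this sketch; wchar_t would be an alternative. */
typedef uint32_t codepoint_t;

/* Widen one stored byte to a Unicode code point.
 * The cast to unsigned char comes first: a signed char holding 0xE0 has
 * the value -32, and converting that directly to a wider type would
 * sign-extend it instead of giving 0xE0. */
codepoint_t widen_cp1251(char c)
{
    unsigned char b = (unsigned char)c;

    if (b < 0x80)
        return b;                        /* ASCII maps to itself */
    if (b >= 0xC0)
        return 0x0410u + (b - 0xC0u);    /* 0xC0..0xFF -> U+0410..U+044F */
    return 0xFFFDu;                      /* simplification: rest -> U+FFFD */
}
```

Every place that compares, indexes with, or stores such a byte would need the
same kind of normalisation before it could switch to the wider type, which is
part of what makes the change so invasive.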

References

How better to work with unicode?
From: Dmitry Polevoy, 2009-02-14