cuneiform team mailing list archive
Message #00241
Re: How better to work with unicode?
On Sat, Feb 14, 2009 at 9:56 AM, Dmitry Polevoy
<openocr.polevoy@xxxxxxxxx> wrote:
> Do you know a good (maybe standard) library for dealing with Unicode
> strings and conversions?
> I found ICU (http://icu-project.org/), but I can't tell whether we should
> use it or something else.
What do you need to do? Output already supports UTF-8. For generic
conversion you can use e.g. iconv.
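
To make "generic conversion" concrete, here is a minimal sketch of what such a re-encoding step does. It uses Python's built-in codecs rather than iconv itself, and both the helper name to_utf8 and the CP1251 source encoding are assumptions chosen only for illustration, not something taken from the Cuneiform code.

```python
# Hypothetical helper: re-encode text from a single-byte code page to UTF-8.
# CP1251 is an assumed source encoding, used here only as an example.
def to_utf8(raw: bytes, source_encoding: str = "cp1251") -> bytes:
    # Decode the bytes into Unicode code points, then encode them as UTF-8.
    return raw.decode(source_encoding).encode("utf-8")


sample = "Привет".encode("cp1251")  # one byte per letter in CP1251
converted = to_utf8(sample)         # two bytes per Cyrillic letter in UTF-8
```

On the command line, the same kind of conversion with iconv would look like `iconv -f CP1251 -t UTF-8 in.txt > out.txt`, where the file names are placeholders.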
If you want to extend Cuneiform to support full Unicode in recognition,
that will be extremely difficult. Currently each letter is stored in
one byte (sometimes signed, sometimes unsigned). The first task would
be to change this to a wide character type such as wchar_t or something
similar. That is going to take a lot of very invasive work.
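
A small sketch of why the one-byte storage is a problem, written in Python for brevity. It is not Cuneiform code; the CP1251 byte value is an assumed example. The same byte reads as two different numbers depending on signedness, and the corresponding Unicode code point does not fit in one byte at all.

```python
import struct

# Cyrillic small letter a (U+0430); in CP1251 it is the single byte 0xE0.
letter = "\u0430"
raw = letter.encode("cp1251")

# The same stored byte, read as an unsigned and as a signed 8-bit value.
as_unsigned = struct.unpack("B", raw)[0]
as_signed = struct.unpack("b", raw)[0]

# The Unicode code point for the same letter is larger than any byte value.
code_point = ord(letter)
fits_in_one_byte = code_point <= 0xFF
```

Here as_unsigned is 224 and as_signed is -32, so any code that mixes the two interpretations (for example, when indexing a table by character) can disagree about the same letter. The code point 0x430 needs a wider type, which is why every place that assumes one byte per letter would have to change.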