cuneiform team mailing list archive

Re: unicode support

To: Jussi Pakkanen <jpakkane@xxxxxxxxx>
From: Alex Samorukov <samm@xxxxxxxxxxx>
Date: Sun, 31 Aug 2008 11:25:33 +0200
Cc: cuneiform@xxxxxxxxxxxxxxxxxxx
In-reply-to: <42d23b2e0808301238l14777e5h65c9c60a45a44d74@mail.gmail.com>
User-agent: Thunderbird 2.0.0.16 (X11/20080724)

Jussi Pakkanen wrote:

On Fri, Aug 29, 2008 at 5:17 PM, Alex Samorukov <samm@xxxxxxxxxxx> wrote:

be a very big task to use UTF16 internally. But it will make sense only in
case of supporting DBCS languages or many different languages on one page
(e.g. Russian and Polish).


We would also need documentation on how to create data files for new
languages.

I asked Dmitry about this. At a brief glance, some files (rec7*.dat) are dictionaries (they are used in the Windows version to highlight misspelled words; the format is more or less described in speldict.h), and some of them (rec8*, rec9*) are spelltab files (see speltab.h).

vital.dat and viteng.dat are referenced only from dc*.dat files. They are not shown in my strace outputs, and I have no idea what they are.

The same story with cube*.dat. There is a string "Printed Digits with ~128" inside, but these files are not referenced at all!
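
A rough way to repeat the "is this file referenced anywhere" check is to search every file in the tree for the data file's name. The sketch below is only an illustration of that idea, not the procedure actually used here; the function name and the case-insensitive match are my assumptions.

```python
from pathlib import Path


def files_mentioning(root, needle):
    """Return relative paths of files under root whose bytes contain needle.

    The comparison is case-insensitive (an assumption: names may be stored
    in a different case inside sources or binary data files).
    """
    target = needle.lower().encode()
    hits = []
    for path in sorted(Path(root).rglob("*")):
        # Skip directories; read everything else as raw bytes so that
        # binary .dat files are handled the same way as .c sources.
        if path.is_file() and target in path.read_bytes().lower():
            hits.append(str(path.relative_to(root)))
    return hits
```

An empty result for a given name would match the "not referenced at all" case, although a name assembled at run time would not be found this way.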


rec6*.dat are just alphabet description files (structure is very simple).

rec4* files are in use by p2_cour.c and leo_dll.c; they look like data for some font detection algorithms (not sure).

rec3, rec2 and rec1 are language-related files, see rcm.c for details. I think they are the most important ones for adding new languages.


So, for now we have some candidates to remove from the distro:
cube*.dat, vit*.dat (~13 MB).
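
The strace side of this check could be sketched roughly as follows: collect every path passed to open()/openat() in a saved strace log and list the data files that never show up. This is a minimal sketch under assumptions; the helper names are made up for illustration, and the log is assumed to be ordinary strace text output.

```python
import re

# Matches the quoted path argument of open("...") or openat(DIRFD, "...").
OPEN_RE = re.compile(r'\bopen(?:at)?\((?:[^,"]*,\s*)?"([^"]+)"')


def opened_files(strace_text):
    """Return the set of base names passed to open()/openat().

    Failed opens are counted too, since they still show that the
    program asked for the file.
    """
    names = set()
    for line in strace_text.splitlines():
        match = OPEN_RE.search(line)
        if match:
            names.add(match.group(1).rsplit("/", 1)[-1])
    return names


def never_opened(data_files, strace_text):
    """Return the data files that appear in no open call, sorted."""
    seen = opened_files(strace_text)
    return sorted(name for name in data_files if name not in seen)
```

Files reported by never_opened() are only candidates for removal: a single run exercises one language and one code path, so logs from several runs should be combined before deleting anything.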

References

unicode support
From: Alex Samorukov, 2008-08-29
Re: unicode support
From: Jussi Pakkanen, 2008-08-30