
cuneiform team mailing list archive

Re: unicode support

 

Jussi Pakkanen wrote:
On Fri, Aug 29, 2008 at 5:17 PM, Alex Samorukov <samm@xxxxxxxxxxx> wrote:

be a very big task to use UTF16 internally. But it will make sense only in
case of supporting DBCS languages or many different languages on one page
(e.g. Russian and Polish).
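
To illustrate the mixed-language case mentioned above, here is a minimal Python sketch (not cuneiform code). It assumes the usual Windows code pages cp1251 (Cyrillic) and cp1250 (Central European) as stand-ins for a single-byte internal encoding; the sample strings and the `encodable` helper are made up for the example.

```python
# Illustration only; not part of cuneiform.
RUSSIAN = "Привет"
POLISH = "Zażółć gęślą jaźń"
MIXED = RUSSIAN + " / " + POLISH


def encodable(text: str, codec: str) -> bool:
    """Return True if every character of `text` can be encoded with `codec`."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


if __name__ == "__main__":
    # One row per encoding: can it hold Russian, Polish, and both together?
    for codec in ("cp1251", "cp1250", "utf-16-le"):
        print(codec,
              encodable(RUSSIAN, codec),
              encodable(POLISH, codec),
              encodable(MIXED, codec))
```

Each single-byte code page should handle its own language but not the mixed string, while UTF-16 handles all three. That is the situation where a wider internal encoding starts to pay off.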

We would also need documentation on how to create data files for new
languages.
I asked Dmitry about this. At a glance, some files (rec7*.dat) are dictionaries; they are used in the Windows version to highlight misspelled words (the format is more or less described in speldict.h). Others (rec8*, rec9*) are spelltab files (see speltab.h).

vital.dat and viteng.dat are referenced only from dc*.dat files. They do not show up in my strace output, and I have no idea what they are.

The same story with cube*.dat. There is a string "Printed Digits with ~ 128", but these files are not referenced at all!
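
For anyone who wants to repeat the strace check, a rough Python sketch follows. It assumes strace output in the usual `open(...)`/`openat(...)` text form; the sample paths and file names are invented for the example and are not taken from a real run.

```python
import re
from pathlib import PurePosixPath

# Matches open()/openat() lines in strace output, for example:
#   open("/some/dir/rec1.dat", O_RDONLY) = 3
#   openat(AT_FDCWD, "/some/dir/rec6.dat", O_RDONLY) = 4
OPEN_RE = re.compile(
    r'\bopen(?:at)?\((?:[^,"]+,\s*)?"([^"]+)"[^)]*\)\s*=\s*(-?\d+)'
)

# Hypothetical sample; paths and names are made up.
SAMPLE = '''\
open("/usr/share/cuneiform/rec1.dat", O_RDONLY) = 3
openat(AT_FDCWD, "/usr/share/cuneiform/rec6.dat", O_RDONLY) = 4
open("/usr/share/cuneiform/cube1.dat", O_RDONLY) = -1 ENOENT (No such file or directory)
'''


def opened_dat_files(strace_text):
    """Return base names of .dat files that were opened successfully."""
    names = set()
    for match in OPEN_RE.finditer(strace_text):
        path, result = match.group(1), int(match.group(2))
        # A negative return value means the open call failed.
        if result >= 0 and path.lower().endswith(".dat"):
            names.add(PurePosixPath(path).name)
    return names


def never_opened(installed, strace_text):
    """Return installed data files that never appear as a successful open."""
    return sorted(set(installed) - opened_dat_files(strace_text))


if __name__ == "__main__":
    print(never_opened(["rec1.dat", "rec6.dat", "cube1.dat"], SAMPLE))
```

A file missing from one trace is only a hint: it may still be loaded for another language or option, so the result is worth cross-checking against the sources.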

rec6*.dat are just alphabet description files (structure is very simple).

rec4* files are used by p2_cour.c and leo_dll.c; they look like data for some font detection algorithms (not sure).

rec3, rec2 and rec1 are language-related files; see rcm.c for details. I think they are the most important ones for adding new languages.

So, for now we have some candidates to remove from the distro:
cube*.dat and vit*.dat (~13 MB).
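
To double-check the size estimate on a given install, something like the following Python sketch could be used. The `total_size` helper is made up for the example, and the data directory has to be supplied by the caller since it differs between installs.

```python
import sys
from pathlib import Path


def total_size(data_dir, patterns=("cube*.dat", "vit*.dat")):
    """Return (sorted file names, total size in bytes) for the glob patterns."""
    files = sorted(
        {p for pattern in patterns
         for p in Path(data_dir).glob(pattern)
         if p.is_file()}
    )
    return [p.name for p in files], sum(p.stat().st_size for p in files)


if __name__ == "__main__":
    # Usage: python sizes.py <data directory>
    names, size = total_size(sys.argv[1])
    for name in names:
        print(name)
    print(f"{len(names)} files, {size / (1024 * 1024):.1f} MB")
```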



