cuneiform team mailing list archive
Message #00325
User Dictionary work
I have some preliminary support for User Dictionaries working. I have,
for example, successfully gotten 'com' to stop turning into 'corn' by
adding 'com' to a user dictionary.
Since I've never used bzr before, it will take me extra time to create
patches. I wanted to find out how likely they were to be accepted before
I did the work. Here are some of the things that are required:
1) Bugfix in open_data_file so it creates files with reasonable perms
2) Bugfix in voc_write (crash when voc->lev == -1)
3) Many functions in Kern/rling have to be marked as visible in the .so.
They look like they were meant to be visible. For example,
InitializeAlphabet, InitializeNewUserDict, CloseUserDictionary, ...
4) Setting RSTR_pchar_user_dict_name has to load the dictionaries
5) cuneiform-cli needs options to select user dictionaries
6) A new program cuneiform-dict to turn wordlists into dictionaries
7) Something probably has to be done to allow the dictionaries to
be in a local path instead of always trying /usr/local/share/cuneiform
(applies to creating and loading).
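
To make items 1, 3 and 7 concrete, here is a rough sketch of what I have in mind. It is not the actual patch: every sketch_* name, the SKETCH_EXPORT macro and the CUNEIFORM_DICT_DIR environment variable are made up for illustration, and I am not claiming this is how open_data_file is written or how the build hides symbols. Only the /usr/local/share/cuneiform default comes from the current behaviour.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical export macro (item 3); the real tree may spell this differently. */
#if defined(__GNUC__)
#define SKETCH_EXPORT __attribute__((visibility("default")))
#else
#define SKETCH_EXPORT
#endif

/* The default is the path from the list above; the variable name is an assumption. */
#define SKETCH_DEFAULT_DICT_DIR "/usr/local/share/cuneiform"
#define SKETCH_DICT_DIR_ENV "CUNEIFORM_DICT_DIR"

/*
 * Item 7: build "<dir>/<name>" in buf, where <dir> is taken from the
 * environment if set and non-empty, else the compiled-in default.
 * Returns 0 on success, -1 if the result does not fit in buf.
 */
SKETCH_EXPORT int sketch_dict_path(const char *name, char *buf, size_t buflen)
{
    assert(name != NULL && buf != NULL);

    const char *dir = getenv(SKETCH_DICT_DIR_ENV);
    if (dir == NULL || dir[0] == '\0')
        dir = SKETCH_DEFAULT_DICT_DIR;

    /* directory + '/' + name + terminating NUL */
    if (strlen(dir) + 1 + strlen(name) + 1 > buflen)
        return -1;

    snprintf(buf, buflen, "%s/%s", dir, name);
    return 0;
}

/*
 * Item 1: create (or truncate) a dictionary file with an explicit mode of
 * rw-r--r--, which the process umask can still narrow. Returns the file
 * descriptor from open(), or -1 on error.
 */
SKETCH_EXPORT int sketch_create_dict_file(const char *path)
{
    assert(path != NULL);
    return open(path, O_WRONLY | O_CREAT | O_TRUNC,
                S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH);
}
```

The point of passing the mode explicitly is that open() with O_CREAT takes the new file's permissions from that third argument, so leaving it out gives unpredictable permissions. The visibility attribute only matters if the library is built with hidden default visibility, which is my guess at why those Kern/rling functions are not reachable today.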
The role of the user dictionary is to help the code select an alternative
when there is more than one possibility for a word. It will not outright
replace words if the code which tries variations does not try your version.
Since the spelart.c code loads a table that knows rn<->m (see rec9.dat),
I was able to tell it 'com' was a word, so it would consider it a good
choice. However, I have a document in an odd font where my last name
looks like ']ackson' and OCRs as '3ackson' or '9ackson' and it never even
*tries* 'Jackson' so having it in the dictionary doesn't help.
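
Here is a toy version of that behaviour, to show why the dictionary entry alone is not enough. This is not the spelart.c algorithm and the table is not the rec9.dat format; the struct, the function names and the "first dictionary hit wins" rule are all simplifications of mine. It only tries single substitutions from a confusion table and uses the user dictionary as the sole tie-breaker.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One confusable pair; an illustrative stand-in, not the rec9.dat format. */
struct sketch_confusion {
    const char *seen;   /* what the recognizer produced */
    const char *meant;  /* what it may really have been */
};

/* Linear search is enough for a sketch; returns 1 if word is in dict. */
static int sketch_in_dict(const char *word,
                          const char *const *dict, size_t ndict)
{
    for (size_t i = 0; i < ndict; i++)
        if (strcmp(word, dict[i]) == 0)
            return 1;
    return 0;
}

/*
 * Try the raw OCR word, then every single substitution from the table.
 * Copies the first variant found in the dictionary into out and returns 1.
 * If no variant is a dictionary word, copies the raw word and returns 0.
 * out must be large enough to hold the raw word.
 */
int sketch_pick_word(const char *ocr,
                     const struct sketch_confusion *table, size_t ntable,
                     const char *const *dict, size_t ndict,
                     char *out, size_t outlen)
{
    assert(ocr != NULL && out != NULL && outlen > strlen(ocr));

    if (sketch_in_dict(ocr, dict, ndict)) {
        strcpy(out, ocr);
        return 1;
    }

    size_t len = strlen(ocr);
    for (size_t t = 0; t < ntable; t++) {
        size_t slen = strlen(table[t].seen);
        size_t mlen = strlen(table[t].meant);

        /* Skip rules that cannot match or whose result would not fit. */
        if (slen == 0 || slen > len || len - slen + mlen + 1 > outlen)
            continue;

        /* Apply the rule at each position where "seen" occurs. */
        for (const char *p = strstr(ocr, table[t].seen); p != NULL;
             p = strstr(p + 1, table[t].seen)) {
            size_t prefix = (size_t)(p - ocr);
            memcpy(out, ocr, prefix);
            memcpy(out + prefix, table[t].meant, mlen);
            strcpy(out + prefix + mlen, p + slen);
            if (sketch_in_dict(out, dict, ndict))
                return 1;
        }
    }

    /* The dictionary never gets a say on variants that were never built. */
    strcpy(out, ocr);
    return 0;
}
```

With a table holding rn<->m and a dictionary containing 'com' and 'Jackson', passing 'corn' yields 'com', but passing '3ackson' comes back unchanged, because no rule ever produces a 'J' for the dictionary to confirm.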
--
Ben Jackson AD7GD
<ben@xxxxxxx>
http://www.ben.com/