cuneiform team mailing list archive

new user with questions

 

Greetings,
I am interested in using cuneiform for OCR on my Linux computer.  I am running Ubuntu 8.04 (Hardy Heron).  I have managed to download and compile cuneiform 0.8.0 (obtained from the cuneiform Launchpad page).

(a) Is this the version of cuneiform I should be using?  Is there a way to use git/svn/... to automatically pull the cuneiform source code as it's updated, without downloading the whole tarball each time?

I've been experimenting with the cuneiform I was able to build.  I would like to scan books for the Bookshare project during my spare time.  At the moment, I am having trouble with the page breaks when scanning a book with two facing pages in one image.

Currently, when I scan two facing pages at once, cuneiform puts the two page headers at the top, followed by the contents of both pages one after the other.  E.g.:

line 1: page number and author
line 2: title and right-hand page number
line 3: typically blank
line 4-end: main text body.

As a result, I cannot separate the two original pages in the output, and keeping them separate is a crucial requirement for submitting books to Bookshare.  Is it possible to get cuneiform to output all the left-hand text followed by all the right-hand text?  E.g.:

page number  Author

<body of left-hand page>

title page number

<body of right-hand page>
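
A possible workaround for the layout above, in case cuneiform itself cannot do this, is to cut each scan into a left and a right half before recognition, so that every image holds a single page. The sketch below assumes ImageMagick's `convert` is installed and that the gutter sits near the horizontal middle of the scan; `split_spread` is a hypothetical helper name, not part of cuneiform.

```shell
#!/bin/sh
# Sketch only: split a facing-pages scan into two single-page images.
# ASSUMPTIONS: ImageMagick's `convert` is on the PATH, and the gutter
# between the pages is close to the horizontal middle of the scan.

# split_spread is a hypothetical helper name, not part of cuneiform.
split_spread() {
    scan=$1
    base=${scan%.*}    # file name without its extension
    # -crop 50%x100% cuts the image into half-width, full-height tiles;
    # +repage resets the canvas offset of each tile.
    # %d numbers the tiles: 0 is the left page, 1 is the right page.
    convert "$scan" -crop 50%x100% +repage "${base}_%d.tiff"
}
```

Calling `split_spread page07.tiff` should leave `page07_0.tiff` (left) and `page07_1.tiff` (right), which can then be recognized in that order. It is worth checking the first few results by eye, since a skewed scan or an off-centre gutter would need a different crop.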

Finally, can I append the recognition results of multiple scans to the same file?  I tried "cuneiform -f rtf -o test.rtf *.tiff" on a handful of consecutively numbered image files, but each result overwrites the previous one, and I am left with only the output from the last file recognized.
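
If cuneiform has no append option, one way around the overwriting would be to run it once per image and concatenate the results in the shell. The sketch below switches to plain text, because text files concatenate cleanly while each RTF file carries its own header. It assumes this build accepts `-f text` (only `-f rtf` appears in the command above); `ocr_append` is a hypothetical helper name, not part of cuneiform.

```shell
#!/bin/sh
# Sketch only: recognize each image separately and append every result
# to one output file.
# ASSUMPTION: this cuneiform build accepts "-f text"; only "-f rtf" is
# known to work from the command tried above.

# ocr_append is a hypothetical helper name, not part of cuneiform.
ocr_append() {
    out=$1
    shift                           # remaining arguments are image files
    tmp=$(mktemp) || return 1       # scratch file for one page of output
    : > "$out"                      # start with an empty output file
    for img in "$@"; do
        # Stop at the first image that fails to recognize.
        cuneiform -f text -o "$tmp" "$img" || { rm -f "$tmp"; return 1; }
        cat "$tmp" >> "$out"
        printf '\f\n' >> "$out"     # form feed marks the page break
    done
    rm -f "$tmp"
}
```

Usage would be `ocr_append book.txt *.tiff`. The shell expands `*.tiff` in lexical order, so the file numbers need zero padding (`page01.tiff`, `page02.tiff`, ...) for the pages to come out in sequence.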

Thanks in advance for any help you may provide :-)