cuneiform team mailing list archive
Message #00467
new user with questions
Greetings,
I am interested in using cuneiform for OCR on my Linux computer. I am running Ubuntu 8.04 (Hardy Heron). I have managed to download and compile cuneiform 0.8.0 (obtained from the cuneiform Launchpad page).
(a) Is this the version of cuneiform I should be using? Is there a way to use git/svn/... to automatically pull the cuneiform source code as it's updated, without downloading the whole tarball each time?
I've been experimenting with the cuneiform I was able to build. I would like to scan books for the Bookshare project during my spare time. At the moment, I am having trouble with the page breaks when a single scan contains two facing pages.
Currently, when I scan two facing pages together, cuneiform puts the two page headers at the top, followed by the contents of both pages one after the other. For example:
line 1: page number and author
line 2: title and right-hand page number
line 3: typically blank
line 4-end: main text body.
As it stands, I cannot separate the two original pages, which is a crucial requirement for submitting books to Bookshare. Is it possible to get cuneiform to output all the left-hand text followed by all the right-hand text? For example:
page number Author
<body of left-hand page>
title page number
<body of right-hand page>
Finally, can I append the recognition results of multiple scans to the same file? I tried "cuneiform -f rtf -o test.rtf *.tiff" on a handful of consecutively numbered image files, but each result overwrites the previous one, and I am left with only the output from the last file recognized.
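One possible workaround for the overwriting, sketched here under assumptions rather than confirmed against cuneiform itself: run cuneiform once per image, write each result to its own temporary file, and append those files to a single output. The sketch uses only the "-f" and "-o" flags from the command above. The format name "text" is an assumption (plain text is simple to concatenate, whereas RTF files cannot just be joined end to end), and the function names and the form-feed page separator are my own hypothetical choices.

```python
import subprocess
import tempfile
from pathlib import Path


def build_command(image, out_path, fmt="text"):
    # "-f" and "-o" are the flags from the command quoted above.
    # The format name "text" is an assumption; substitute whatever
    # plain-text format your cuneiform build accepts.
    return ["cuneiform", "-f", fmt, "-o", str(out_path), str(image)]


def recognize_all(images, combined_path, run=subprocess.run):
    """Recognize each image separately and append the results to one file."""
    with tempfile.TemporaryDirectory() as tmp, \
            open(combined_path, "w", encoding="utf-8") as combined:
        # Plain string sort: file names need zero-padded numbers
        # (page_01, page_02, ...) to come out in page order.
        for index, image in enumerate(sorted(images)):
            page_out = Path(tmp) / f"page_{index:04d}.txt"
            # One cuneiform run per image, so nothing gets overwritten.
            run(build_command(image, page_out), check=True)
            combined.write(page_out.read_text(encoding="utf-8", errors="replace"))
            # Form feed between pages (a hypothetical separator convention).
            combined.write("\n\f\n")
```

It could be called as recognize_all(glob.glob("*.tiff"), "book.txt") after "import glob". The "run" parameter exists only so the loop can be exercised without cuneiform installed.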
Thanks in advance for any help you may provide :-)