
cuneiform team mailing list archive

Re: Hocr output status and identified improvements.

 

Thanks, I emailed the ocropus list.

A new revision is ready to be pulled. I have tested the hOCR output and it works fine.
ocr_line now follows the standard as described in the 2007 hOCR reference mentioned earlier
(e.g. the character bboxes are in ocr_cinfo, and the line's text is the plain text content of the ocr_line tag).
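
To illustrate that layout, here is a minimal Python sketch that reads one such line. The sample markup, the `x_bboxes` property name in the `title` attribute, and the helper name `parse_line` are assumptions made for illustration and are not copied from actual Cuneiform output; check them against the real files before relying on this.

```python
from html.parser import HTMLParser

# Hypothetical sample line: plain text directly inside ocr_line,
# character boxes in a nested ocr_cinfo element (assumed layout).
SAMPLE = (
    '<span class="ocr_line" title="bbox 10 20 110 40">Hi'
    '<span class="ocr_cinfo" title="x_bboxes 10 20 50 40 60 20 110 40"></span>'
    '</span>'
)


class LineParser(HTMLParser):
    """Collects the text and per-character boxes of a single ocr_line."""

    def __init__(self):
        super().__init__()
        self.text = ""
        self.char_boxes = []
        self._depth = 0  # span nesting depth, counted from the ocr_line element

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        cls = attrs.get("class") or ""
        if tag == "span" and (self._depth > 0 or cls == "ocr_line"):
            self._depth += 1
        if cls == "ocr_cinfo":
            # Drop the property name, then group the numbers four at a time.
            nums = [int(n) for n in (attrs.get("title") or "").split()[1:]]
            self.char_boxes = [tuple(nums[i:i + 4]) for i in range(0, len(nums), 4)]

    def handle_endtag(self, tag):
        if tag == "span" and self._depth > 0:
            self._depth -= 1

    def handle_data(self, data):
        # Only text that sits directly inside ocr_line belongs to the line.
        if self._depth == 1:
            self.text += data


def parse_line(markup):
    """Return (line text, list of (x0, y0, x1, y1) character boxes)."""
    parser = LineParser()
    parser.feed(markup)
    return parser.text, parser.char_boxes


text, boxes = parse_line(SAMPLE)
```

With this layout a consumer can take the line text as is and pair each character with its box by position.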

Julien
________________________________________
From: Yury V. Zaytsev [yury@xxxxxxxxxx]
Sent: 2 October 2009 10:46
To: julien
Subject: Re: [Cuneiform] Hocr output status and identified improvements.

On Thu, 2009-10-01 at 22:44 +0000, julien wrote:
>
> My goal is to standardize the hocr output as much as possible.
> From what I have understood, it originates from the authors of ocropus.
> The standard referred to by ocropus is:
> http://docs.google.com/View?docid=dfxcv4vc_67g844kf

Did you check the OCRopus mailing lists, by the way? They might have
a newer version of the standard which is not yet on the
website...

--
Sincerely yours,
Yury V. Zaytsev



