cuneiform team mailing list archive
Message #00546
[Bug 548801] [NEW] Hocr has corrupted bounding boxes for images
Public bug reported:
With the hOCR output format, the resulting file includes images as
ocr_line elements, but their bounding boxes are always set to zeros, e.g.:
<p>
<span class='ocr_line' id='line_5' title="bbox 0 0 0 0"><img src=bug_files/1.bmp width=756 height=552 alt="bug_files/1.bmp">
</span>
</p>
Hence, it's not possible to correctly place images on the resulting
page.
Command used (image file attached):
cuneiform -l ita -f hocr -o bug.htm Bug.png
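One way to confirm the symptom in a generated file is to scan the hOCR for images that sit inside an ocr_line whose bbox is all zeros. The following is only a minimal sketch using Python's standard html.parser; the class and function names are made up for illustration and are not part of Cuneiform. It assumes the image is a direct child of the ocr_line span, as in the output quoted above.

```python
from html.parser import HTMLParser


class ZeroBBoxImageFinder(HTMLParser):
    """Collect <img> sources found inside an ocr_line span with an all-zero bbox.

    Hypothetical helper for illustration; assumes no nested spans
    between the ocr_line span and the image.
    """

    def __init__(self):
        super().__init__()
        self.in_zero_line = False
        self.images = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "span" and attrs.get("class") == "ocr_line":
            # The title may hold several ';'-separated properties; check the first.
            first_prop = (attrs.get("title") or "").split(";")[0].split()
            self.in_zero_line = first_prop == ["bbox", "0", "0", "0", "0"]
        elif tag == "img" and self.in_zero_line:
            self.images.append(attrs.get("src"))

    def handle_endtag(self, tag):
        if tag == "span":
            self.in_zero_line = False


def find_zero_bbox_images(hocr_text):
    """Return the src of every image placed in a zero-bbox ocr_line."""
    finder = ZeroBBoxImageFinder()
    finder.feed(hocr_text)
    finder.close()
    return finder.images


# The snippet from the report, used as sample input.
sample = """<p>
<span class='ocr_line' id='line_5' title="bbox 0 0 0 0"><img src=bug_files/1.bmp width=756 height=552 alt="bug_files/1.bmp">
</span>
</p>"""

print(find_zero_bbox_images(sample))
```

Run against the contents of bug.htm, a non-empty result would list the images whose position cannot be recovered from the hOCR output.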
** Affects: cuneiform-linux
Importance: Undecided
Status: New
--
Hocr has corrupted bounding boxes for images
https://bugs.launchpad.net/bugs/548801
You received this bug notification because you are a member of Cuneiform
Linux, which is the registrant for Cuneiform for Linux.
Status in Linux port of Cuneiform: New