cuneiform team mailing list archive

[Bug 623438] Re: Font size not correct in merged sandvich PDF

 

What I cannot understand is why you wouldn't file a bug against
hocr2pdf.

As you discovered, Cuneiform exports bboxes for both lines and
characters, so it shouldn't be its fault. So what can we do for you
now? You are not going to get the font metrics: recovering them is very
ambiguous and a lot of work. The bboxes are more than enough to fit the
characters approximately.
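
To illustrate what "approximately fit" could mean, here is a minimal
Python sketch. It assumes the usual hOCR convention of a title
attribute containing "bbox x0 y0 x1 y1" in pixels, and that you know
the scan resolution. The function names and the natural_width_pt
parameter are made up for this example; this is not how hocr2pdf
actually does it.

```python
import re

# hOCR stores boxes in the title attribute as "bbox x0 y0 x1 y1" (pixels).
BBOX_RE = re.compile(r"bbox (\d+) (\d+) (\d+) (\d+)")


def parse_bbox(title):
    """Return (x0, y0, x1, y1) in pixels from an hOCR title attribute."""
    match = BBOX_RE.search(title)
    if match is None:
        raise ValueError("no bbox found in title: %r" % title)
    return tuple(int(value) for value in match.groups())


def font_size_pt(line_bbox, dpi):
    """Approximate the font size in points from the line bbox height.

    One point is 1/72 inch, so pixels * 72 / dpi gives points. The bbox
    height covers ascenders and descenders, so this is only an estimate.
    """
    x0, y0, x1, y1 = line_bbox
    return (y1 - y0) * 72.0 / dpi


def horizontal_scale(line_bbox, natural_width_pt, dpi):
    """Scale factor that stretches rendered text to the bbox width.

    natural_width_pt is a hypothetical input: the width the text would
    have in the chosen PDF font at the estimated size.
    """
    x0, y0, x1, y1 = line_bbox
    target_width_pt = (x1 - x0) * 72.0 / dpi
    return target_width_pt / natural_width_pt
```

For example, a line box 50 pixels high on a 300 dpi scan works out to
50 * 72 / 300 = 12 points. Scaling the invisible text horizontally to
the box width is what keeps the selection aligned with the image even
when the substitute font differs from the printed one.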

And as surprising as it might sound, not everybody is interested in
creating sandwich PDFs. I, for one, don't care. So you will have to
push for it yourself if you want it solved.

-- 
Font size not correct in merged sandvich PDF
https://bugs.launchpad.net/bugs/623438
You received this bug notification because you are a member of Cuneiform
Linux, which is the registrant for Cuneiform for Linux.

Status in Linux port of Cuneiform: Invalid

Bug description:
After processing with Cuneiform for Linux 1.0.0 and the hOCR to PDF converter, version 0.7.4 (which should be the most current version), I get a sandwich PDF that looks fine until I select text.

See the sample 5AADFEE1-0000.* files in the attachment and the result.pdf.
The effect is shown in screen087.png

For another file (Test10pages.pdf) the effect is even worse: I basically cannot select text to copy any more, because I can only guess where to move the mouse.

It looks like the font size in the HTML is somehow not correct. I am not an expert, but this link might help you:
http://www.emdpi.com/fontsize.html




