
cuneiform team mailing list archive

Re: [Bug 623438] Re: Font size not correct in merged sandvich PDF

 

Thanks for the response and info.  The main issue with most front ends, open 
source or commercial, is that they tend to be very graphical and not 
accessible to screen readers.  Even reading well-formed PDF documents 
directly was a problem until recently.  I think the latest Evince in GNOME 
finally starts to address this.

Don Marang

There is just so much stuff in the world that, to me, is devoid of any real 
substance, value, and content that I just try to make sure that I am working 
on things that matter.
Dean Kamen


--------------------------------------------------
From: "Yury V. Zaytsev" <yury@xxxxxxxxxx>
Sent: Saturday, September 11, 2010 9:41 AM
To: <cuneiform@xxxxxxxxxxxxxxxxxxx>
Subject: [Cuneiform] [Bug 623438] Re: Font size not correct in 
merged sandvich PDF

> I am not aware of any open source OCR software that does multi-column
> document recognition. That is more of a segmentation task than a
> recognition task, so it should rather be implemented in a front end
> such as OCRopus. If you have a linear text flow, a sufficiently smart
> screen reader can read sandwich PDFs reasonably well.
>
> Apart from the already mentioned FineReader, the old Cuneiform freeware
> for Windows seems to be able to handle multi-column layouts.
>
> -- 
> Font size not correct in merged sandvich PDF
> https://bugs.launchpad.net/bugs/623438
> You received this bug notification because you are a member of Cuneiform
> Linux, which is the registrant for Cuneiform for Linux.
>
> Status in Linux port of Cuneiform: Invalid
> Status in “exactimage” package in Ubuntu: New
>
> Bug description:
> After processing with Cuneiform for Linux 1.0.0 and the hOCR to PDF 
> converter, version 0.7.4 (which should be the most current version), I get 
> a sandwich PDF that looks fine until I select text.
>
> See the sample 5AADFEE1-0000.* files in the attachment and the result.pdf.
> The effect is shown in screen087.png.
>
> For another file (Test10pages.pdf) the effect is even worse: I basically 
> cannot select text to copy any more, because I can only guess where to 
> move the mouse.
>
> It looks like the font size in the HTML is somehow not correct. I am 
> not an expert, but this link might help you:
> http://www.emdpi.com/fontsize.html
>
>
>
> _______________________________________________
> Mailing list: https://launchpad.net/~cuneiform
> Post to     : cuneiform@xxxxxxxxxxxxxxxxxxx
> Unsubscribe : https://launchpad.net/~cuneiform
> More help   : https://help.launchpad.net/ListHelp
>
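
For illustration only: the hidden text layer of a sandwich PDF has to be sized from the pixel boxes recorded in the hOCR file, and an error in that conversion (for example, a wrong scan resolution) could produce the kind of selection mismatch described in the bug report. The sketch below is a minimal example under stated assumptions. The helper names `bbox_from_title` and `font_size_pt` are hypothetical, the DPI is assumed to be known, and this is not the code of the hOCR to PDF converter mentioned above.

```python
import re

# hOCR stores geometry in the title attribute, e.g. "bbox 100 200 900 250".
BBOX_RE = re.compile(r"bbox (\d+) (\d+) (\d+) (\d+)")


def bbox_from_title(title):
    """Extract (x0, y0, x1, y1) in pixels from an hOCR title attribute."""
    match = BBOX_RE.search(title)
    if match is None:
        raise ValueError(f"no bbox in title: {title!r}")
    return tuple(int(value) for value in match.groups())


def font_size_pt(title, dpi):
    """Approximate a font size in points from the height of a line box.

    One point is 1/72 inch, so pixels * 72 / dpi gives points.
    Assumption: the full box height stands in for the font size, which
    ignores the difference between ascenders, descenders and the em size.
    """
    _, y0, _, y1 = bbox_from_title(title)
    return (y1 - y0) * 72.0 / dpi


# A line box 50 pixels high, scanned at 300 DPI.
print(font_size_pt("bbox 100 200 900 250", dpi=300))  # 12.0
```

If the same box were interpreted at 72 DPI instead of 300 DPI, the result would be 50 points instead of 12, which is the sort of scale error that would make the selectable text drift away from the visible text.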




