
cuneiform team mailing list archive

[Bug 344790] Re: OCR quality drops

 

Alexander, thank you for your report!
I found that recognition quality varies within some range when I change the linking options of cuneiform.
The modules EXC, RSTR, DIF and RBLOCK contain a lot of duplicated functions and variables, and extern int bla-bla-bla declarations are widely used.
When I removed -fvisibility=hidden, some tests showed better recognition quality, but others did not.

When I enabled the -fvisibility compiler flag again for some modules, the quality results of my branch became closer to the original.
I think the way to better recognition results lies in isolating global variables in modules like EXC, LOC, DIF, RSTR and RBLOCK, and in removing duplicated code.
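
As a rough illustration of what isolating a module's globals could look like, here is a minimal sketch. All names in it (MODULE_API, line_threshold, exc_set_line_threshold, exc_get_line_threshold) are hypothetical and not taken from the Cuneiform sources. The state is declared static, so it has internal linkage and cannot collide with a variable of the same name in another module, whatever the linking options are. Only the accessor functions are marked visible, which is how GCC and Clang export symbols when building with -fvisibility=hidden (MSVC would need a different export macro, not shown here).

```c
#include <assert.h>

/* Hypothetical export macro: with -fvisibility=hidden on GCC/Clang,
   only symbols marked like this are exported from the shared library. */
#if defined(__GNUC__)
#define MODULE_API __attribute__((visibility("default")))
#else
#define MODULE_API
#endif

/* Hypothetical module state. `static` gives it internal linkage, so no
   other module can reach it through an `extern int line_threshold;`. */
static int line_threshold = 0;

/* The only way to change the state from outside this file. */
MODULE_API void exc_set_line_threshold(int value)
{
    assert(value >= 0); /* assumed constraint, for illustration only */
    line_threshold = value;
}

/* The only way to read the state from outside this file. */
MODULE_API int exc_get_line_threshold(void)
{
    return line_threshold;
}
```

With this layout, two modules can each keep their own line_threshold without the linker silently merging them into one variable.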

I am also planning to add automatic regression tests, like the openocr guys have done, so that it will be possible to detect regressions after every small change.
But my main goals for now are:
1. Compiling with MSVC
2. Fixing crashes under FreeBSD and NetBSD during recognition of Russian, Bulgarian and Ukrainian
3. Increasing test coverage for the code I have written
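
The automatic regression check mentioned above could be as simple as comparing freshly recognized text with a stored reference, line by line. The sketch below is only an assumption about how such a check might work, not the openocr approach or an existing cuneiform test. The function name count_differing_lines is made up, and it works on in-memory strings to stay self-contained.

```c
#include <stddef.h>
#include <string.h>

/* Counts the line positions at which two texts differ.
   Lines are separated by '\n'. A non-empty line that exists in only
   one of the texts counts as a difference; a trailing newline or
   trailing empty lines are ignored. */
size_t count_differing_lines(const char *expected, const char *actual)
{
    size_t diffs = 0;

    while (*expected != '\0' || *actual != '\0') {
        /* Length of the current line in each text, without the '\n'. */
        size_t elen = strcspn(expected, "\n");
        size_t alen = strcspn(actual, "\n");

        if (elen != alen || memcmp(expected, actual, elen) != 0)
            diffs++;

        /* Step over the line and its terminating newline, if any. */
        expected += elen;
        actual += alen;
        if (*expected == '\n')
            expected++;
        if (*actual == '\n')
            actual++;
    }
    return diffs;
}
```

A test run would pass when the count is zero for every reference page. For the two versions of line 7 of stdj4.tif quoted in the bug description below, the function reports one differing line.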

-- 
OCR quality drops
https://bugs.launchpad.net/bugs/344790
You received this bug notification because you are a member of Cuneiform
Linux, which is the registrant for Cuneiform for Linux.

Status in Linux port of Cuneiform: New

Bug description:
OCR quality drops during porting.
Look at the result of recognition stdj4.tif, line 7, smart text format

was (stdj4.txt.initial)
mli i f r nin. Ithas

is (stdj4.txt.puma)
m li i f r nin Ithas