
cuneiform team mailing list archive

[Bug 344790] Re: OCR quality drops (mising point)

 

I wrote a small bash script for use with the ISRI OCR Performance Toolkit
(http://www.isri.unlv.edu/ISRI/OCRtk).

The script merges the text zone files into one file, runs recognition with and without the dictionary, and produces an accuracy report for each recognized file plus a final report.
To use the script, copy into one directory the contents of the two directories from one of the test packages at http://www.isri.unlv.edu/ISRI/OCRtk, together with the "accsum" and "accuracy" programs from http://www.isri.unlv.edu/downloads/ftk-1.0.tgz, then run:
 test Z 3B
where Z is the first letter of the zone file extension and 3B is the extension of the image files.

After the script finishes, you get a text report.
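
The attached script is not reproduced here. As a rough sketch of the workflow described above, the functions below assume a hypothetical file layout in which PAGE.<Z>NN are the ground-truth zone text files and PAGE.<EXT> is the page image. The function names, the OCR_CMD/ACCURACY/ACCSUM variables and the output file names are illustrative assumptions, and the "accuracy" and "accsum" argument order shown in the comments is how I understand those tools to be called, so check it against the toolkit documentation.

```shell
# Sketch only; this is not the attached script.
OCR_CMD=${OCR_CMD:-cuneiform}     # OCR program, assumed to accept: -o OUT IMAGE
ACCURACY=${ACCURACY:-./accuracy}  # assumed usage: accuracy CORRECT GENERATED REPORT
ACCSUM=${ACCSUM:-./accsum}        # assumed usage: accsum REPORT... > SUMMARY

# Concatenate all zone files of one page into a single ground-truth file.
merge_zones() {
    local page="$1" zone_letter="$2"
    cat "$page.$zone_letter"* > "$page.truth"
}

# Recognize one page and score it against the merged ground truth.
# Arguments after the tag are passed to the OCR program unchanged.
score_page() {
    local page="$1" ext="$2" tag="$3"
    shift 3
    "$OCR_CMD" "$@" -o "$page.$tag.txt" "$page.$ext" || return 1
    "$ACCURACY" "$page.truth" "$page.$tag.txt" "$page.$tag.acc"
}

# Process every image in the current directory and write a final report.
run_all() {
    local zone_letter="$1" ext="$2" image page
    set --                        # reuse "$@" as the list of report files
    for image in *."$ext"; do
        [ -e "$image" ] || continue
        page=${image%."$ext"}
        merge_zones "$page" "$zone_letter"
        score_page "$page" "$ext" dict && set -- "$@" "$page.dict.acc"
    done
    [ "$#" -gt 0 ] && "$ACCSUM" "$@" > final_report.txt
}
```

With these definitions, `run_all Z 3B` would mirror the `test Z 3B` call. A second pass without the dictionary would be one more `score_page` call with a different tag and whatever option the OCR program provides for that; no such option is assumed here.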

I ran this script on 3b.tgz and got many cuneiform errors such as

Unknown DIB format
CTIImageList::AddImage: invalid image info

and

Assertion failed: 0 file /home/mgraf/refactoring/src/lns32/rbambuk.cpp, line 173

Press <Space> to continue execution, <Esc> to abort


** Attachment added: "bash script"
   http://launchpadlibrarian.net/35456628/test

-- 
OCR quality drops (mising point)
https://bugs.launchpad.net/bugs/344790
You received this bug notification because you are a member of Cuneiform
Linux, which is the registrant for Cuneiform for Linux.

Status in Linux port of Cuneiform: New

Bug description:
OCR quality drops during porting.
Look at the recognition result for stdj4.tif, line 7, smart text format:

was (stdj4.txt.initial)
mli i f r nin. Ithas

is (stdj4.txt.puma)
m li i f r nin Ithas