
sikuli-driver team mailing list archive

[Bug 1224811] Re: [1.0.1] improvements for text recognition

 

Tesseract works better if you enlarge the image.
I think it uses black and white for OCR, not even grayscale, so it gets confused when the gaps between the letters are gray rather than clearly black or white.
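
To illustrate the point about gray gaps, here is a minimal sketch. It assumes the image really is reduced to black and white with a single fixed threshold before recognition, which is only my guess and not a description of what Tesseract actually does. The helpers binarize and countInkRuns are hypothetical names made up for this example.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper: binarize one grayscale row with a fixed global threshold.
// Pixels darker than `threshold` become ink (0), all others become paper (255).
// Assumption: a plain fixed threshold, not Tesseract's real binarization.
std::vector<std::uint8_t> binarize(const std::vector<std::uint8_t>& gray,
                                   std::uint8_t threshold) {
    std::vector<std::uint8_t> out;
    out.reserve(gray.size());
    for (std::uint8_t v : gray) {
        out.push_back(v < threshold ? 0 : 255);
    }
    return out;
}

// Count the separate runs of ink (0) pixels in a binarized row.
// Two letter strokes that stay apart give two runs; merged strokes give one.
int countInkRuns(const std::vector<std::uint8_t>& bw) {
    int runs = 0;
    bool inInk = false;
    for (std::uint8_t v : bw) {
        const bool ink = (v == 0);
        if (ink && !inInk) {
            ++runs;
        }
        inInk = ink;
    }
    return runs;
}
```

Take the row {255, 0, 0, 110, 0, 0, 255}: two dark strokes separated by one anti-aliased gap pixel of value 110. With a threshold of 128 the gap pixel turns into ink and the two strokes merge into a single run; with a threshold of 100 the gap survives and the strokes stay separate.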

It would be useful if a scale argument could be added to text() or to Settings.

Right now, I can't see a way to run OCR on an enlarged image: text() only
works on a Region.  Can it be used with an Image?
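
As a rough sketch of what such a scale step ahead of OCR might do, the block below enlarges a grayscale buffer by an integer factor. It uses nearest-neighbour replication rather than the cubic scaling I used in GIMP, purely to keep the example short. GrayImage and upscaleNearest are hypothetical names for illustration and are not part of Sikuli or Tesseract.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical minimal grayscale image: row-major pixels, rows x cols.
struct GrayImage {
    std::size_t rows;
    std::size_t cols;
    std::vector<std::uint8_t> px;
};

// Enlarge by an integer factor using nearest-neighbour replication:
// every source pixel becomes a factor x factor block in the output.
// Assumption: this stands in for whatever scaling a real scale option would use.
GrayImage upscaleNearest(const GrayImage& in, std::size_t factor) {
    GrayImage out{in.rows * factor, in.cols * factor, {}};
    out.px.resize(out.rows * out.cols);
    for (std::size_t r = 0; r < out.rows; ++r) {
        for (std::size_t c = 0; c < out.cols; ++c) {
            // Map each output pixel back to the source pixel it was copied from.
            out.px[r * out.cols + c] = in.px[(r / factor) * in.cols + (c / factor)];
        }
    }
    return out;
}
```

For example, a 1x2 image with pixels {10, 200} scaled by 2 becomes a 2x4 image whose rows are both 10, 10, 200, 200.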


Here are some tests with the attached image:
the original, then the same image scaled 2x and 3x using GIMP (cubic), with the results from Tesseract 3.02 for each.


------Results from tesseract 3.02----

mm lnruwzk hmvm raxmmvm D>l’V nu Wm dafl
um mequlci. hmwn fox lunllkduier u. my an;

u..n-umuuu hvrmn miuumu over uu ht} dug
::»,un. quick hmvm fax ilmlped u... nu lazy dug

um-1.. quick hmmu (ax jumped rner Ilue nu, dug
lfipt The quick brown fox jumped over the lazy dog

ispc The quick brown fox jumped over the lazy dog

32pt The quick brown fox jumped over the lazy dog

Same image resized 2x cubic with gimp...
lllpl Thequick hmwn foxjumped ovtrlhel-.I1_v dog

llpl'l1Ir quick bmwn [ox jumped over lhr lazy dog

12])! The quick hmwn foxjumpcd over the hazy dog
l3pt The quick brown fox jumped over the lazy dog

I-lpt The quick brown fox jumped over the lazy dog
l6pt The quick brown fox jumped over the lazy dog

l8pt The quick brown fox jumped over the lazy (log

32pt The quick brown fox jumped over the lazy dog

Same image resized 3x cubic with gimp...

llilpl The quick brmm foxjumpcd over the lazy dog
Hp! 11:‘ quick brown fox jumped over the Buy dog

I211! The qulek brown foxjumped over the lazy dog
l3pl The quick brown fox jumped over the lazy dog

l-lpt The quick brown fox jumped over the lazy dog


** Attachment added: "quickbrown.png"
   https://bugs.launchpad.net/sikuli/+bug/1224811/+attachment/3841551/+files/quickbrown.png

-- 
You received this bug notification because you are a member of Sikuli
Drivers, which is subscribed to Sikuli.
https://bugs.launchpad.net/bugs/1224811

Title:
  [1.0.1] improvements for text recognition

Status in Sikuli:
  In Progress

Bug description:
  ------------ small fonts
  ... more background info in the related question
  Improvement in OCR recognition with small images (thanks to Jose Damian)

  Anyway, I think it is a mistake to use interpolation to enlarge these small images. Given that an image height greater than 30 pixels is needed, the best solution for me has been to enlarge the image by adding pixels at the border:
  ---------------
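  if (in_img.rows < MIN_HEIGHT) {
     // Integer factor needed to reach at least MIN_HEIGHT rows.
     scale = ceil(MIN_HEIGHT / float(in_img.rows));
     // Pad only the bottom and right edges, replicating the edge pixels.
     copyMakeBorder(in_img, out_img, 0, (scale-1)*in_img.rows, 0, (scale-1)*in_img.cols, BORDER_REPLICATE);
  }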
  ---------------

  This solution achieves near-perfect recognition with my image samples.
  I don't know whether this is the right treatment for all kinds of small images, or whether it could be included in a future release, but at least it is a change that people can try if they have problems recognizing small fonts.
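
For anyone who wants to see what the padding does without pulling in OpenCV, here is a minimal standalone sketch of the same idea: grow the image to a multiple of its size by repeating the last row and column, with no interpolation. GrayImage and padReplicate are hypothetical names for this example, and the clamping loop is my own plain re-statement of replicate-style border padding on the bottom and right edges, not code taken from Sikuli.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical minimal grayscale image: row-major pixels, rows x cols.
struct GrayImage {
    std::size_t rows;
    std::size_t cols;
    std::vector<std::uint8_t> px;
};

// Pad the bottom and right edges by replicating the edge pixels until the
// image has at least minHeight rows. Images that are already tall enough
// (or empty) are returned unchanged.
GrayImage padReplicate(const GrayImage& in, std::size_t minHeight) {
    if (in.rows == 0 || in.cols == 0 || in.rows >= minHeight) {
        return in;
    }
    // Integer ceiling of minHeight / rows, as in the snippet above.
    const std::size_t scale = (minHeight + in.rows - 1) / in.rows;
    GrayImage out{in.rows * scale, in.cols * scale, {}};
    out.px.resize(out.rows * out.cols);
    for (std::size_t r = 0; r < out.rows; ++r) {
        for (std::size_t c = 0; c < out.cols; ++c) {
            // Clamp to the last source row/column: this is the replication.
            const std::size_t sr = std::min(r, in.rows - 1);
            const std::size_t sc = std::min(c, in.cols - 1);
            out.px[r * out.cols + c] = in.px[sr * in.cols + sc];
        }
    }
    return out;
}
```

With a 2x2 image {1, 2, 3, 4} and a minimum height of 4, the result is 4x4: the original pixels stay untouched in the top-left corner, and the rest is filled with copies of the last row and last column.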

To manage notifications about this bug go to:
https://bugs.launchpad.net/sikuli/+bug/1224811/+subscriptions

