
sikuli-driver team mailing list archive

[Bug 1224811] [NEW] [1.0.1] improvements for text recognition

 

Public bug reported:

------------ small fonts
... more background info in the related question
Improvement in OCR recognition with small images (thanks to Jose Damian)

Anyway, I think it is a mistake to use interpolation to enlarge these small images. Since an image height of at least 30 pixels is needed, the best solution for me has been to enlarge the image by adding pixels at the border:
---------------
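if (in_img.rows < MIN_HEIGHT){
   // smallest integer factor that brings the height up to MIN_HEIGHT
   scale = ceil(MIN_HEIGHT / float(in_img.rows));
   // pad only at the bottom and right, repeating the edge pixels
   copyMakeBorder(in_img, out_img, 0, (scale-1)*in_img.rows, 0, (scale-1)*in_img.cols, BORDER_REPLICATE);
}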
---------------

This solution achieves near-perfect recognition with my image samples.
I don't know whether this is the right treatment for all kinds of small images, or whether it could be included in a future release, but it is at least a change that people can try if they have problems recognizing small fonts.
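
To make the arithmetic of the padding approach easier to follow, here is a minimal sketch without OpenCV. It is not the code from tessocr.cpp: the Gray type, scale_for and pad_replicate are hypothetical names used only for illustration, and MIN_HEIGHT = 30 is an assumption based on the 30 pixels mentioned above. It mimics what copyMakeBorder does with BORDER_REPLICATE when only the bottom and right borders are extended.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-in for a grayscale image: rows of 8-bit pixels.
using Gray = std::vector<std::vector<std::uint8_t>>;

// Assumed threshold, taken from the "30 pixels" mentioned in the report.
constexpr int MIN_HEIGHT = 30;

// Same factor as in the snippet above: the smallest integer scale
// for which rows * scale >= MIN_HEIGHT.
int scale_for(int rows) {
    return static_cast<int>(std::ceil(MIN_HEIGHT / static_cast<float>(rows)));
}

// Pads only at the bottom and right by repeating the last row and the
// last column. The original pixels stay untouched in the top-left corner.
Gray pad_replicate(const Gray& in) {
    if (in.empty() || in[0].empty()) return in;
    const int rows = static_cast<int>(in.size());
    if (rows >= MIN_HEIGHT) return in;  // tall enough, nothing to do

    const std::size_t cols = in[0].size();
    const int scale = scale_for(rows);

    Gray out;
    out.reserve(static_cast<std::size_t>(rows) * scale);
    for (int r = 0; r < rows * scale; ++r) {
        // Rows below the original image repeat the last original row.
        const std::vector<std::uint8_t>& src = in[std::min(r, rows - 1)];
        assert(src.size() == cols);  // the input must be rectangular
        std::vector<std::uint8_t> row(src);
        // Columns right of the original image repeat the last pixel.
        row.resize(cols * scale, src.back());
        out.push_back(std::move(row));
    }
    return out;
}
```

For a 20-pixel-high image the factor is ceil(30 / 20) = 2, so the result is 40 pixels high and twice as wide, with the original glyph pixels unchanged. Whether this helps recognition for other fonts is untested here.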

** Affects: sikuli
     Importance: Medium
     Assignee: RaiMan (raimund-hocke)
         Status: In Progress

** Description changed:

- I've been doing some OCR text recognition tests with sikuli+tesseract,
- and the results have been quite poor. Probably the main problem is that
- the font in the images is very small (8 pixels high), with a total image
- height of 20 pixels.
- 
- After training tesseract with samples of this font, the images were
- correctly identified by tesseract alone; but when analyzed by sikuli,
- the results were much worse, so it seems that image treatment is
- different in sikuli and in the tesseract executable, the latter being
- better for small images.
- 
- I think there is a problem with the resize operation that sikuli
- VisionProxy does on images that have a height of less than 30 pixels. It
- is currently doing a nearest-neighbor interpolation (INTER_NEAREST),
- whereas OpenCV recommends a bicubic interpolation over a 4x4 pixel
- neighborhood (INTER_CUBIC) for enlarging images.
- 
- Changing this resize operation in tessocr.cpp somewhat improved my results in sikuli:
- ---------------
- if (in_img.rows < MIN_HEIGHT){
-    scale = ceil(MIN_HEIGHT / float(in_img.rows));
-    resize(in_img, out_img, Size(in_img.cols*scale,in_img.rows*scale), 0, 0, INTER_CUBIC);
- ---------------
- 
+ ... more in the related question
  
  Anyway, I think it is an error to use interpolation to enlarge these small images. Given that it is needed an image height greater than 30 pixels, the best solution for me has been to enlarge the image adding pixels to the border:
  ---------------
  if (in_img.rows < MIN_HEIGHT){
-    scale = ceil(MIN_HEIGHT / float(in_img.rows));
-    copyMakeBorder (in_img, out_img, 0, (scale-1)*in_img.rows, 0, (scale-1)*in_img.cols, BORDER_REPLICATE);
+    scale = ceil(MIN_HEIGHT / float(in_img.rows));
+    copyMakeBorder (in_img, out_img, 0, (scale-1)*in_img.rows, 0, (scale-1)*in_img.cols, BORDER_REPLICATE);
  ---------------
  
- This solution achieves near perfect recognition with my image samples. 
+ This solution achieves near perfect recognition with my image samples.
  I don't know if this is the right treatment for all kinds of small images and could be included in a future release; but at least it is a change that people can try if they have problems recognizing small fonts.

** Description changed:

- ... more in the related question
+ ------------ small fonts
+ ... more background info in the related question
+ Improvement in OCR recognition with small images
  
  Anyway, I think it is an error to use interpolation to enlarge these small images. Given that it is needed an image height greater than 30 pixels, the best solution for me has been to enlarge the image adding pixels to the border:
  ---------------
  if (in_img.rows < MIN_HEIGHT){
     scale = ceil(MIN_HEIGHT / float(in_img.rows));
     copyMakeBorder (in_img, out_img, 0, (scale-1)*in_img.rows, 0, (scale-1)*in_img.cols, BORDER_REPLICATE);
  ---------------
  
  This solution achieves near perfect recognition with my image samples.
  I don't know if this is the right treatment for all kinds of small images and could be included in a future release; but at least it is a change that people can try if they have problems recognizing small fonts.

** Description changed:

  ------------ small fonts
  ... more background info in the related question
- Improvement in OCR recognition with small images
+ Improvement in OCR recognition with small images (thanks to Jose Damian)
  
  Anyway, I think it is an error to use interpolation to enlarge these small images. Given that it is needed an image height greater than 30 pixels, the best solution for me has been to enlarge the image adding pixels to the border:
  ---------------
  if (in_img.rows < MIN_HEIGHT){
     scale = ceil(MIN_HEIGHT / float(in_img.rows));
     copyMakeBorder (in_img, out_img, 0, (scale-1)*in_img.rows, 0, (scale-1)*in_img.cols, BORDER_REPLICATE);
  ---------------
  
  This solution achieves near perfect recognition with my image samples.
  I don't know if this is the right treatment for all kinds of small images and could be included in a future release; but at least it is a change that people can try if they have problems recognizing small fonts.

** Changed in: sikuli
       Status: New => In Progress

** Changed in: sikuli
   Importance: Undecided => Medium

** Changed in: sikuli
     Assignee: (unassigned) => RaiMan (raimund-hocke)

** Changed in: sikuli
    Milestone: None => 1.1.0

-- 
You received this bug notification because you are a member of Sikuli
Drivers, which is subscribed to Sikuli.
https://bugs.launchpad.net/bugs/1224811

Title:
  [1.0.1] improvements for text recognition

Status in Sikuli:
  In Progress


To manage notifications about this bug go to:
https://bugs.launchpad.net/sikuli/+bug/1224811/+subscriptions

