sikuli-driver team mailing list archive

[Bug 1447076] [NEW] [request] want to handover options to Tesseract (like already possible with language)

 

Public bug reported:

I've been using the OCR features with some success, but recognition is
not reliable. In particular, the OCR gets much worse as soon as it
encounters white text on a dark background. Opinion online seems divided
on whether this is a known Tesseract limitation or whether it shouldn't
be a problem at all.
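
One workaround often suggested for light-on-dark text is to invert the
image before handing it to the OCR engine, so that the engine sees dark
text on a light background. The following is only a minimal sketch of
that idea, not something Sikuli is known to do internally. It assumes an
8-bit grayscale image represented as a list of rows of integers; the
helper names (mean_brightness, invert, normalize_polarity) and the
threshold of 128 are hypothetical and chosen purely for illustration.

```python
def mean_brightness(pixels):
    """Average value of an 8-bit grayscale image given as rows of ints."""
    total = sum(sum(row) for row in pixels)
    count = sum(len(row) for row in pixels)
    return total / count


def invert(pixels):
    """Flip every pixel value: 0 becomes 255, 255 becomes 0."""
    return [[255 - value for value in row] for row in pixels]


def normalize_polarity(pixels, threshold=128):
    """Return a dark-on-light image.

    Assumption: if the image is mostly dark, the background is dark and
    the text is light, so the image is inverted. Otherwise it is
    returned unchanged.
    """
    if mean_brightness(pixels) < threshold:
        return invert(pixels)
    return pixels


# A tiny "white text on black" sample: one bright pixel on a dark field.
dark = [
    [0, 0, 0, 0],
    [0, 255, 0, 0],
    [0, 0, 0, 0],
]
light = normalize_polarity(dark)
```

Comparing OCR results on the original and the inverted capture of the
same region would show whether polarity is really what hurts
recognition, which is useful information to bring to the Tesseract
developers.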

I've also noticed that Region.text() sometimes finds text that
find(text) doesn't.

In any case, I'd like to debug this and understand it, and potentially
ask the Tesseract people for tips. Is there any documentation or guide
on how SikuliX uses Tesseract, comparable to the explanation of how it
uses OpenCV? Is there an equivalent to find(text) in Tesseract, or is
that algorithm implemented entirely in Sikuli?

** Affects: sikuli
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Sikuli
Drivers, which is subscribed to Sikuli.
https://bugs.launchpad.net/bugs/1447076

Title:
  [request] want to handover options to Tesseract (like already possible
  with language)

Status in Sikuli:
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/sikuli/+bug/1447076/+subscriptions

