sikuli-driver team mailing list archive

Re: [Question #271593]: work with java OCR

 

Question #271593 on Sikuli changed:
https://answers.launchpad.net/sikuli/+question/271593

    Status: Open => Answered

RaiMan proposed the following answer:
Thanks for the kind feedback and your willingness to contribute by
answering questions.

OCR is still in bad shape (not really revised since the first
implementation). Improving it and making it more configurable is
planned only for version 2.

Currently you can do everything that is based on additional content and
option files in the tessdata folder, as described in the Tesseract 3
documentation.

The other option (if the task is not time critical) is always to install your own Tesseract and use it from a SikuliX workflow via the command line:
- create an image that contains the text
- optionally optimize the image for OCR with an image processing package (like ImageMagick)
- run the Tesseract command with the appropriate options
- read the resulting text file
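
The steps above could be sketched roughly as follows. This is only an illustration, not part of SikuliX: the function names and file names are made up, it assumes that the tesseract and ImageMagick convert executables are on the PATH, and it uses the Tesseract 3 option syntax (-psm).

```python
import io
import subprocess


def imagemagick_command(src, dst, scale_percent=300):
    """Build an ImageMagick call that converts to grayscale and enlarges.

    Assumption: the legacy "convert" executable is on the PATH
    (ImageMagick 7 calls it "magick").
    """
    return ["convert", src, "-colorspace", "Gray",
            "-resize", "%d%%" % scale_percent, dst]


def tesseract_command(image, output_base, lang="eng", psm=None):
    """Build a Tesseract 3 call; the result is written to output_base + '.txt'."""
    cmd = ["tesseract", image, output_base, "-l", lang]
    if psm is not None:
        # Tesseract 3 syntax; Tesseract 4 and later expect "--psm" instead
        cmd += ["-psm", str(psm)]
    return cmd


def ocr_via_command_line(image, output_base, lang="eng", psm=None,
                         preprocess=False):
    """Optionally preprocess the image, run Tesseract, return the text."""
    if preprocess:
        prepared = output_base + "_prepared.png"  # hypothetical file name
        subprocess.check_call(imagemagick_command(image, prepared))
        image = prepared
    subprocess.check_call(tesseract_command(image, output_base, lang, psm))
    # Tesseract appends ".txt" to the output base name
    with io.open(output_base + ".txt", encoding="utf-8") as f:
        return f.read().strip()
```

In a script this might be called as ocr_via_command_line(someImageFile, "ocr_result", psm=7), where someImageFile is the path of a saved image; which language and page segmentation mode work best depends on the text being read.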

--- version 1.1.0+: basic image processing to get better OCR results
You can do the following to bring your captured image into the best condition for Tesseract OCR:
img = capture(someRegion) # get the image from the screen (in memory)
img1 = Image.create(img) # create an image object with the new Image class (still in memory)
imgGrey = img1.convertImageToGrayscale(img1.get()) # convert the image to grayscale (still in memory)
imgGreyResized = imgGrey.resize(factor) # resize the image to about 300 dpi (a factor of 3 is usually sufficient) (still in memory)

... then directly try
text = imgGreyResized.text() # using the SikuliX builtin OCR implementation

... or use Tesseract from the command line if special options are needed
imgSaved = imgGreyResized.asFile() # write to a temp file and get the filename

Sorry, but this is not in the docs.
