sikuli-driver team mailing list archive

Re: [Question #670961]: Digital Monitors (e.g: Restaurant Menu display Board) automation testing using SikuliX

To: sikuli-driver@xxxxxxxxxxxxxxxxxxx
From: Alex <question670961@xxxxxxxxxxxxxxxxxxxxx>
Date: Wed, 18 Jul 2018 21:32:24 -0000
Reply-to: question670961@xxxxxxxxxxxxxxxxxxxxx
Sender: bounces@xxxxxxxxxxxxx

Question #670961 on Sikuli changed:
https://answers.launchpad.net/sikuli/+question/670961

Alex posted a new comment:
There are loads of resources online for training Tesseract. It is not a fun operation and is extraordinarily tedious. I would suggest starting by optimizing the text for OCR:
-Dark (black) text on a white background is best.
-Larger text is better.
-Some fonts are simply more OCR friendly.

Big issues are characters such as capital 'eye', lower case 'ell', the
number one, and pipes. Example: I1l| . Depending on your font, that text
can be impossible to read.
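
One way to spot risky labels before they reach the screen is to scan the text for those characters. This is only a minimal sketch: the `CONFUSABLE` set and the `confusable_chars` helper are illustrative names, and the set holds just the four characters mentioned above.

```python
# Glyphs that look alike in many fonts: capital I, digit 1, lowercase l, pipe.
# The contents of this set are an assumption; extend it for your own font.
CONFUSABLE = set("I1l|")

def confusable_chars(text):
    """Return the confusable characters found in text, in order of appearance."""
    return [ch for ch in text if ch in CONFUSABLE]

# Labels that return a non-empty list deserve a closer look.
print(confusable_chars("Chili 1lb"))
print(confusable_chars("soup"))
```

A label that returns an empty list is not guaranteed to read correctly, it only avoids this particular set of look-alikes.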

Getting back to training Tesseract, I'd suggest googling the process. It's not hard once you have a workflow set up, but the process is very specific to your platform and application. I used the following guides:
-https://github.com/tesseract-ocr/tesseract/wiki/TrainingTesseract-4.00
-http://pretius.com/how-to-prepare-training-files-for-tesseract-ocr-and-improve-characters-recognition/
-https://medium.com/apegroup-texts/training-tesseract-for-labels-receipts-and-such-690f452e8f79

I used jTessBoxEditor, but there are other box editors that I'm sure work
just as well. As I mentioned, training Tesseract is extraordinarily
tedious. I would make that your path of last resort for solving this
problem.

You could also consider just using images. It's tedious as well but has
a very high success rate when configured properly. I've used techniques
in which I matched an image of text and then created a dynamic region to
the right of the found image to read a series of numbers.
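
The geometry behind that dynamic region can be sketched in plain Python. The `Rect` type and the `region_right_of` helper are hypothetical names used for illustration, not SikuliX API. In an actual SikuliX script, calling `right(width)` on the match and then `text()` on the result should cover the same ground, but verify that against your SikuliX version.

```python
from collections import namedtuple

# Hypothetical stand-in for a screen rectangle; x, y is the top-left corner.
Rect = namedtuple("Rect", "x y w h")

def region_right_of(match, width, gap=0):
    """Return a Rect of the given width starting at the match's right edge.

    The new region keeps the match's top edge and height, so it covers
    text that sits on the same line as the matched image.
    """
    return Rect(match.x + match.w + gap, match.y, width, match.h)

# Example: a label image found at (100, 200), 80 px wide and 20 px tall.
label = Rect(100, 200, 80, 20)
numbers = region_right_of(label, 150)
print(numbers)
```

Because the region is derived from wherever the image was found, it keeps working when the label moves around the display.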

Another technique I've used is an 'error dictionary'. In some areas
Sikuli will consistently read text incorrectly. It may read the word
'apples' as 'apple5'. I created a function that accepts OCR text
output and modifies it based on known failures. A code example follows:

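import re

# Known OCR misreads mapped to the intended text.
errorDict = {
    'amplication': 'amplification',
    'weathel': 'weather',
    # ...
}

def fixOcrText(value):  # function name is illustrative
    # Strip everything except letters and digits (this also removes spaces).
    value = re.sub(r'[^a-zA-Z0-9]', '', value)

    # Swap in the correction when the cleaned text is a known misread.
    if value in errorDict:
        value = errorDict[value]

    return value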

This allowed us to avoid the Tesseract training in many areas and might
work in your situation. Alternatively, you could just look for the bad
text. If Sikuli reads 'hotdogs' as 'h0td0gs', you could tell it to click
the 'bad' spelling of the word.
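
Searching for the bad text can be kept tidy with a small lookup, roughly as sketched here. The `KNOWN_MISREADS` table and the `ocr_search_text` helper are hypothetical names, and the single entry is the example from this message.

```python
# Hypothetical table: intended label -> text that OCR actually returns for it.
KNOWN_MISREADS = {
    "hotdogs": "h0td0gs",
}

def ocr_search_text(label):
    """Return the string to search for on screen for a given label.

    Falls back to the label itself when no misread has been recorded.
    """
    return KNOWN_MISREADS.get(label, label)

print(ocr_search_text("hotdogs"))
print(ocr_search_text("fries"))
```

The test script can then keep using the correct spelling everywhere and only translate it at the point where it searches the screen.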

--
You received this question notification because your team Sikuli Drivers
is an answer contact for Sikuli.