sikuli-driver team mailing list archive

[Question #240729]: Infinite loop when detecting spaces

To: sikuli-driver@xxxxxxxxxxxxxxxxxxx
From: Eugene S <question240729@xxxxxxxxxxxxxxxxxxxxx>
Date: Fri, 13 Dec 2013 08:01:33 -0000
Reply-to: question240729@xxxxxxxxxxxxxxxxxxxxx
Sender: bounces@xxxxxxxxxxxxx

New question #240729 on Sikuli:
https://answers.launchpad.net/sikuli/+question/240729

Hi all,

As an attempt to create an alternative to the built-in Tesseract OCR, I came up with the following idea (high-level):

1. Create a screenshot of each character (a screenshot for 'a', a screenshot for 'b', etc.).
2. Iterate over each character in a word and compare it against the collection of character screenshots. The one that matches perfectly is the letter.
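
A minimal sketch of the lookup in step 2, assuming each character image has already been reduced to a small grid of 0/1 pixels. The names `recognize` and `templates` and the sample grids are hypothetical and only illustrate the exact-match idea; they are not part of Sikuli.

```python
# Sketch only: glyphs are hypothetical lists of rows, 0 = background, 1 = ink.
def recognize(glyph, templates):
    """Return the character whose template equals the glyph exactly, or None."""
    for char, template in templates.items():
        if template == glyph:
            return char
    return None

# Hypothetical 2x2 "screenshots" of two characters.
templates = {
    'a': [[1, 0], [1, 1]],
    'b': [[0, 1], [1, 1]],
}

match = recognize([[0, 1], [1, 1]], templates)
no_match = recognize([[0, 0], [0, 0]], templates)
```

An exact comparison like this only works if rendering is pixel-identical every time; any anti-aliasing or scaling difference would need a similarity threshold instead.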

I know it might not be very efficient or quick, but as long as it provides consistent results, that's enough for me.

So the first challenge is "segmentation" (character isolation). To do that, I thought I would detect the gaps between letters, treating a bar of empty space 1 pixel wide and a few pixels high as a separator. So I created a pattern image which is basically a 1xN bar of white pixels.
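
The separator idea can be sketched without any image matching, assuming the text image is available as a grid of 0/1 pixels. The function names and the sample grid below are hypothetical; this is an illustration of column-based segmentation, not what Sikuli does internally.

```python
# Sketch only: the image is a hypothetical list of rows, 0 = white, 1 = ink.
def blank_columns(rows):
    """Return the x indices of columns that contain no ink at all."""
    width = len(rows[0])
    return [x for x in range(width) if all(row[x] == 0 for row in rows)]

def character_spans(rows):
    """Return (start, end) column ranges, end exclusive, that contain ink."""
    blanks = set(blank_columns(rows))
    width = len(rows[0])
    spans = []
    start = None
    for x in range(width):
        if x not in blanks and start is None:
            start = x                    # first inked column of a character
        elif x in blanks and start is not None:
            spans.append((start, x))     # a blank column closes the character
            start = None
    if start is not None:
        spans.append((start, width))     # character touching the right edge
    return spans

sample = [
    [1, 1, 0, 1, 0, 0, 1],
    [1, 0, 0, 1, 0, 0, 1],
    [1, 1, 0, 1, 0, 0, 1],
]

gaps = blank_columns(sample)      # columns 2, 4 and 5 are empty
spans = character_spans(sample)   # three separate characters
```

Adjacent blank columns (4 and 5 in the sample) are treated as one gap here, which is one reason a 1-pixel-wide pattern can match more positions than there are gaps.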

As a next step, I created an image pattern of a short string of plain text and ran the following script to validate that the gaps between letters are detected correctly:

text = find("sampleText.png") # a short string of text

for x in text.findAll("sampleTextSeparator.png"): # a 1xN bar
    x.highlight(1)

However, it seems that instead of iterating over all the gaps in this text, the loop just finds and highlights the same gap each time. I counted the number of iterations and it's 100! (It should be 25, including the spaces between words.)
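
One way to check whether the matches really are the same gap is to compare their coordinates rather than rely on the highlight. The helper below is a hypothetical sketch that works on plain (x, y) tuples. In a Sikuli script the list could presumably be built with something like `[(m.x, m.y) for m in text.findAll("sampleTextSeparator.png")]`, which is an assumption and is not exercised here.

```python
# Sketch only: positions is a list of (x, y) top-left corners of matches.
def summarize_matches(positions):
    """Return (total matches, distinct positions, sorted distinct list)."""
    distinct = sorted(set(positions))
    return len(positions), len(distinct), distinct

# Hypothetical data: three matches, two of them at the same place.
total, unique, where = summarize_matches([(5, 0), (5, 0), (9, 0)])
```

If the total is much larger than the number of distinct positions, the same location is being reported repeatedly; if all positions are distinct but tightly clustered, the pattern is matching at many overlapping offsets inside the same blank area.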

Any ideas why such behavior might happen?

Cheers,
Eugene S
