sikuli-driver team mailing list archive
Message #44390
Re: [Question #660398]: [1.1.x] Changing OCR language from English to something else
Question #660398 on Sikuli changed:
https://answers.launchpad.net/sikuli/+question/660398
Status: Open => Answered
RaiMan proposed the following answer:
OK, and thanks again for not giving up.
My bad: had I checked instead of guessing, I would have
realized that Tesseract 3.02 is the correct choice.
I have corrected the answer and FAQ 2709 accordingly.
Starting with Tesseract 3, recognized text is returned as a unicode
string. Staying with SikuliX 1.1.0 therefore causes problems, because its
bundled Jython 2.5 does not recognize unicode strings.
I recommend upgrading to SikuliX 1.1.1, which bundles Jython 2.7 (unicode aware).
Additionally, using Java 7 or even Java 8 (not Java 9 yet!) would be a good choice.
I ran a test with the German language like this:
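import org.sikuli.script.TextRecognizer as TR
Settings.OcrReadText = True   # switch on reading text via OCR
Settings.OcrLanguage = "deu"  # Tesseract language code for German
TR.reset()                    # reset the recognizer so the new language is used
text = selectRegion().text()  # select a region interactively and read its text
uprint(text)  # the normal print statement is not unicode aware
popup(text)   # popup() is unicode aware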
This worked as expected and printed the German umlauts ä, ü, ö.
uprint() is a SikuliX helper function that internally makes unicode
strings printable and can be used like the print statement.
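To illustrate the idea behind such a helper, here is a minimal sketch in plain Python 3 (not Jython, and not the actual SikuliX implementation). The names to_utf8 and uprint_sketch are hypothetical, and encoding to UTF-8 before writing is only an assumption about how unicode strings can be made printable:

```python
import io
import sys


def to_utf8(value):
    """Encode one value as UTF-8 bytes; non-string values go through str() first."""
    if isinstance(value, bytes):
        return value
    return str(value).encode("utf-8")


def uprint_sketch(*args, stream=None):
    """Hypothetical stand-in for uprint(): write args space-separated plus a newline."""
    # Assumption: a binary stream; by default the byte layer under sys.stdout.
    out = stream if stream is not None else sys.stdout.buffer
    out.write(b" ".join(to_utf8(a) for a in args) + b"\n")


# Usage: capture the bytes in memory instead of writing to the console.
buf = io.BytesIO()
uprint_sketch(u"ä", u"ü", u"ö", stream=buf)
```

Writing already-encoded bytes means the console encoding is never asked to convert the umlauts itself, which is the kind of failure a plain print can run into.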
--
You received this question notification because your team Sikuli Drivers
is an answer contact for Sikuli.