sikuli-driver team mailing list archive

Re: [Question #660398]: [1.1.x] Changing OCR language from English to something else

 

Question #660398 on Sikuli changed:
https://answers.launchpad.net/sikuli/+question/660398

Description changed to:
FAQ 2709 has now been revised:
SikuliX 1.1.x uses Tesseract 3.02 internally, so you have to use the language data for that version from here:
https://github.com/tesseract-ocr/tesseract/wiki/Data-Files#data-files-for-version-302

Be aware: the strings returned by Region.text() are Unicode strings.
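
Because the returned text is Unicode, comparing it against expected French strings can fail on accented characters if the two sides use different encodings or normalization forms. Below is a minimal sketch of one way to compare them, using only the Python standard library. The helper names normalize_ocr_text and ocr_text_matches are hypothetical and not part of SikuliX; the sketch assumes that any byte strings are UTF-8 encoded and that whitespace differences from OCR should be ignored.

```python
# -*- coding: utf-8 -*-
from __future__ import unicode_literals  # makes literals Unicode under Jython/Python 2 as well

import unicodedata


def normalize_ocr_text(text):
    """Return text as NFC-normalized Unicode with whitespace runs collapsed.

    Hypothetical helper: byte strings are assumed to be UTF-8 encoded.
    """
    if isinstance(text, bytes):
        text = text.decode("utf-8")
    # NFC merges "e" + combining accent into the single precomposed character.
    text = unicodedata.normalize("NFC", text)
    # OCR output often contains stray newlines and repeated spaces.
    return " ".join(text.split())


def ocr_text_matches(actual, expected):
    """Compare two strings after applying the same normalization to both."""
    return normalize_ocr_text(actual) == normalize_ocr_text(expected)
```

In a script, the first argument would be the value returned by Region.text() and the second the expected label. Note that this only removes encoding and whitespace differences; it does not correct characters that the OCR engine misread.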

------------------------------------------------------------------
After reading FAQ #2709 (found here: https://answers.launchpad.net/sikuli/+faq/2709) I decided it would allow me to validate the French text in my interface. I followed the steps and downloaded both the tessdata-master and langdata-master folders from the link in the FAQ, which now redirects here: https://github.com/tesseract-ocr/

I have tried two setups: placing only fra.traineddata into the SikulixTesseract\tessdata folder, and placing the entire fra folder from tessdata-master into the tessdata folder. Both times, after changing the language and trying to read the text in a region (which is now French), the Java Runtime Environment simply stops and puts an error message in my SikuliX folder.

The message is as follows:

# A fatal error has been detected by the Java Runtime Environment:
#
#  EXCEPTION_ACCESS_VIOLATION (0xc0000005) at pc=0x0000000068b89732, pid=5180, tid=0x0000000000001df4
#
# JRE version: Java(TM) SE Runtime Environment (8.0_131-b11) (build 1.8.0_131-b11)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.131-b11 mixed mode windows-amd64 compressed oops)
# Problematic frame:
# C  [libtesseract-3.dll+0x189732]
#
# Failed to write core dump. Minidumps are not enabled by default on client versions of Windows
#
# If you would like to submit a bug report, please visit:
#   http://bugreport.java.com/bugreport/crash.jsp
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.

Is this error appearing because of something wrong with the setup, or am
I handling the TextRecognizer incorrectly, for example by putting the
wrong language data in the wrong folders?
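
To rule out the folder layout as a cause, a small check like the following could be run before starting the script. It is only a sketch under assumptions: the function name check_traineddata is hypothetical, and it assumes the language file must sit directly inside the tessdata folder under the name <lang>.traineddata, as in the first setup described above. It cannot tell which Tesseract version a file was built for, so a file that passes this check may still be incompatible with the 3.02 engine mentioned in the revised FAQ.

```python
import os


def check_traineddata(tessdata_dir, lang="fra"):
    """Return a list of layout problems; an empty list means none were found.

    Hypothetical helper: only checks file placement and size, not whether
    the file matches the Tesseract version in use.
    """
    problems = []
    filename = lang + ".traineddata"
    expected = os.path.join(tessdata_dir, filename)

    if not os.path.isdir(tessdata_dir):
        problems.append("tessdata folder not found: " + tessdata_dir)
        return problems

    if not os.path.isfile(expected):
        problems.append("missing file: " + expected)
        # A copy inside a subfolder (e.g. tessdata/fra/) is reported separately,
        # since it is not where this check expects the file to be.
        for root, _dirs, files in os.walk(tessdata_dir):
            if root != tessdata_dir and filename in files:
                problems.append("nested copy found in: " + root)
    elif os.path.getsize(expected) == 0:
        problems.append("file is empty: " + expected)

    return problems
```

Calling check_traineddata with the path to the SikulixTesseract\tessdata folder and printing the returned list shows whether fra.traineddata is where this check expects it.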
