sikuli-driver team mailing list archive

[Question #707899]: OCR not recognizing simple text

To: sikuli-driver@xxxxxxxxxxxxxxxxxxx
From: Niklas Jørgensen <question707899@xxxxxxxxxxxxxxxxxxxxx>
Date: Tue, 12 Sep 2023 12:05:34 -0000
Reply-to: question707899@xxxxxxxxxxxxxxxxxxxxx
Sender: noreply@xxxxxxxxxxxxx

New question #707899 on SikuliX:
https://answers.launchpad.net/sikuli/+question/707899

Hey all!

I'm having trouble recognizing some simple text using the built-in Tesseract OCR in SikuliX. At first it had trouble recognizing everything. Setting the language to Danish (the language I need to use) helped a lot, but it keeps misreading SJ/SJ as 5351 or 5531 or something similar. I've been playing with OCR.globalOptions for several hours without any change in the outcome. Help! The text in the image is:

Text: https://ibb.co/ZhmZGxJ

Is there any way to calibrate it so that it actually reads the SJ/SJ? Are the letters perhaps too small or too thin?


This is the code that I have in this test project.

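# Enable OCR-based text search and text reading, with Danish as the OCR language
Settings.OcrTextSearch = True
Settings.OcrTextRead = True
Settings.OcrLanguage = "dan"

# Custom options: legacy Tesseract engine, "digits" config, Danish language
my_options = OCR.Options().oem(OCR.OEM.TESSERACT_ONLY).configs("digits").language("dan")

# Show the global options and the custom options for comparison
print(OCR.globalOptions())
print(my_options)

# Screen region that contains the text (x, y, width, height)
my_region = Region(877,548,167,11)

# Read the region once with the custom options and once with the global options
print(OCR.readWord(my_region,my_options))
print(my_region.text())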

Output:

53511 01-01-2009-31-12-2099
[5]153] €01—01—2009—31—12—2099)
