sikuli-driver team mailing list archive

[Bug 1891848] Re: UnicodeEncodeError with OCR

To: sikuli-driver@xxxxxxxxxxxxxxxxxxx
From: RaiMan <1891848@xxxxxxxxxxxxxxxxxx>
Date: Tue, 25 Aug 2020 16:12:53 -0000
Reply-to: Bug 1891848 <1891848@xxxxxxxxxxxxxxxxxx>
Sender: bounces@xxxxxxxxxxxxx

The OCR/text feature does indeed return a unicode string.

This causes no problems with string operations (as you can see with
your solution).

The only known problem comes up with the print statement, which throws
this error if the string contains non-ASCII characters.

SikuliX provides a uprint() function that can be used instead in such
cases. It accepts comma-separated parameters.
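
A minimal sketch of the behaviour described above, written as plain
Python so it can be run outside SikuliX. The names ocr_text,
ascii_encode_fails and to_ascii_safe are hypothetical and only serve the
illustration; the sample string is an assumption built around the
character u'\u201a' from the report. It shows that a strict ASCII encode
of such a string fails, and two explicit encodes that do not.

```python
# -*- coding: utf-8 -*-
# Sketch only: ocr_text stands in for a value returned by the OCR/text
# feature. It starts with U+201A, the character named in the report.
ocr_text = u"\u201aHello"


def ascii_encode_fails(text):
    """Return True if a strict ASCII encode raises UnicodeEncodeError."""
    try:
        text.encode("ascii")
    except UnicodeEncodeError:
        return True
    return False


def to_ascii_safe(text):
    """Return ASCII-only bytes; each non-ASCII character becomes '?'."""
    return text.encode("ascii", "replace")


failed = ascii_encode_fails(ocr_text)   # strict ASCII cannot represent U+201A
safe = to_ascii_safe(ocr_text)          # lossy, but never raises
utf8_bytes = ocr_text.encode("utf-8")   # lossless explicit encoding

# Inside a SikuliX script, the message above suggests calling
# uprint(ocr_text) instead of print; uprint is not available in plain Python.
```

Unlike a filter that drops characters, the "replace" handler keeps the
string length, and the UTF-8 encode keeps the original content.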

** Changed in: sikuli
       Status: New => Confirmed

** Changed in: sikuli
    Milestone: None => 2.0.5

** Changed in: sikuli
     Assignee: (unassigned) => RaiMan (raimund-hocke)

-- 
You received this bug notification because you are a member of Sikuli
Drivers, which is subscribed to Sikuli.
https://bugs.launchpad.net/bugs/1891848

Title:
  Jython scripting: UnicodeEncodeError with OCR --- not a bug, it's a
  feature

Status in Sikuli:
  Confirmed

Bug description:
  Hi!

  I am using SikuliX 2.0.4 on Windows 10 64-bit, Java 11.

  Sometimes when doing OCR with Reg.text(), the following error occurs
  with the resulting string:

  [error] UnicodeEncodeError ( 'ascii' codec can't encode character u'\u201a' in position 0: ordinal not in range(128) )
  [error] --- Traceback --- error source first

  My workaround is to use this function to correct the string:

  from string import ascii_letters, digits

  def ExtractAlphanumeric(InputString):
      # Keep only ASCII letters, digits, space and "()*"; drop everything else.
      allowed = ascii_letters + digits + " ()*"
      return "".join(ch for ch in InputString if ch in allowed)

  
  I still have the feeling that the error should not occur in the first place.
  Regards,
  Michael

References

[Bug 1891848] [NEW] UnicodeEncodeError with OCR
From: Michael Böhm, 2020-08-17