sikuli-driver team mailing list archive

[Bug 1318624] Re: TextRecognizer using transformed capture still give the same results as original image


Hi, it took me a while, busy period...
Here is some of my code; I hope it clarifies my issue.
It's Groovy, so you can't use it as-is in Java.

The relevant part is near the transformation. If I need to transform the
image (binarize, sharpen edges, ...), I then have to use the native
objects: if I call listText() with the modified BufferedImage, the
results are the same as if I had used the original ScreenImage, while
calling the Vision methods directly gives the expected results.
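For context, the "binarize" transform mentioned above can be done with the standard java.awt.image API before handing the image to the OCR. This is only a minimal sketch of that step: the class name `Binarize` and the fixed threshold of 128 are my own assumptions, not part of Sikuli's API.

```java
import java.awt.image.BufferedImage;

public class Binarize {

    /**
     * Convert an image to pure black/white using a fixed luminance
     * threshold. Pixels darker than the threshold become black,
     * all others white. The threshold (e.g. 128) is an assumption
     * and usually needs tuning per screen capture.
     */
    public static BufferedImage binarize(BufferedImage src, int threshold) {
        BufferedImage out = new BufferedImage(
                src.getWidth(), src.getHeight(), BufferedImage.TYPE_BYTE_BINARY);
        for (int y = 0; y < src.getHeight(); y++) {
            for (int x = 0; x < src.getWidth(); x++) {
                int rgb = src.getRGB(x, y);
                int r = (rgb >> 16) & 0xFF;
                int g = (rgb >> 8) & 0xFF;
                int b = rgb & 0xFF;
                // Standard Rec. 601 luminance weighting
                int lum = (int) (0.299 * r + 0.587 * g + 0.114 * b);
                out.setRGB(x, y, lum < threshold ? 0xFF000000 : 0xFFFFFFFF);
            }
        }
        return out;
    }
}
```

The resulting BufferedImage is what would then be passed to the OCR step (directly via the native Vision objects, since listText() ignores the modified image, as described above).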

Anyway, I'm happy you are moving to Tess4J, and if so I think you can
ignore this request. Exposing that library is more than enough even for
my particular needs... I'm even thinking of introducing it myself in my
app... I hope I don't get too many conflicts.


def find(screen, strings, options) {
        if (strings instanceof String) {
            strings = [strings]
        }
        if (options?.maxWait) {
            screen.setAutoWaitTimeout(options?.maxWait)
        }
        def match = null
        def imageFound = strings?.find { str ->
            try {
                if (options.transform && options.textSearch) {
                    def img = getRegionImage(screen)
                    def buffImage = transformImage(img.getImage(), options.transform)
                    def ocrText = getText(buffImage)
                    if (debug) {
                        def newImg = createScreenImage(img.getROI(), buffImage)
                        saveImage(img)
                        saveImage(newImg)
                    }
                    if (options.searchByWord) {
                        def words = ocrText.getWords()
                        for(int i=0;i<words.size();i++){
                            def w = words.get(i)
                            if (w.string.matches(str)) {
                                log.debug "w=${w.string} - x:${w.x} - y:${w.y} - w:${w.width} - h:${w.height} MATCHES"
                                match = createMatch(w.x, w.y, w.width, w.height, 100, createScreen(), w.string)
                            } else {
                                log.debug "w=${w.string} - x:${w.x} - y:${w.y} - w:${w.width} - h:${w.height} DOESN'T MATCH"
                            }
                        }
                    } else {
                        def paragraphs = ocrText.getParagraphs()
                        def linedwords = []
                        for(int p=0;p<paragraphs.size();p++){
                            def paragraph = paragraphs.get(p)
                            def lines = paragraph.getLines()
                            for(int l=0;l<lines.size();l++){
                                def line = lines.get(l)
                                def words = line.getWords()
                                def linedword = null
                                for(int i=0;i<words.size();i++){
                                    def word = words.get(i)
                                    if (!linedword) {
                                        linedword = [x:word.x+screen.x, y:word.y+screen.y, height:word.height, width:0, string:""]
                                    } else {
                                        linedword.string += " "
                                    }
                                    linedword.string += word.string
                                    linedword.width += word.width
                                }
                                linedwords << linedword
                            }
                        }
                        log.debug "linedwords=${linedwords}"
                        for(int i=0;i<linedwords.size();i++){
                            def w = linedwords.get(i)
                            if (w.string.matches(str)) {
                                log.debug "w=${w.string} - x:${w.x} - y:${w.y} - w:${w.width} - h:${w.height} MATCHES"
                                match = createMatch(w.x, w.y, w.width, w.height, 100, createScreen(), w.string)
                            } else {
                                log.debug "w=${w.string} - x:${w.x} - y:${w.y} - w:${w.width} - h:${w.height} DOESN'T MATCH"
                            }
                        }
                    }
                    
                } else {
                    match = screen.find(str)
                }
            } catch(Exception ex) {
                println "caught exception ${ex}"
                //ex.printStackTrace()
            }
            return match != null
        }
        log.debug "found match (${match}) for image ${imageFound} searching ${strings} in ${screen}"
        
        if (!match && options.moveMouse) {
            def location = getMouseLocation()
            if (screen.getRect().contains(location.x, location.y)) {
                options.moveMouse = false
                def newLocation = getRandomOut(screen.getRect(), location)
                println newLocation
                screen.hover(newLocation)
                wt()
                match = find(screen, strings, options)
                wt()
                screen.hover(location)
            }
        }
        if (!match && !options.silent) {
            throw new GroovyRuntimeException("Unable to find any of ${strings} in region screen=${screen}")
        }
        return match
    }

-- 
You received this bug notification because you are a member of Sikuli
Drivers, which is subscribed to Sikuli.
https://bugs.launchpad.net/bugs/1318624

Title:
  TextRecognizer using transformed capture still give the same results
  as original image

Status in Sikuli:
  Incomplete

Bug description:
  Hi,
  I tried using the TextRecognizer listText(ScreenImage simg, Region parent) method, passing a newly created ScreenImage that I produce by binarizing the original screen capture so I can easily read a complex image, but the resulting List of Match objects gives the same results as the original ScreenImage. It seems this method uses the Region instead of the ScreenImage as the source for the OCR.

To manage notifications about this bug go to:
https://bugs.launchpad.net/sikuli/+bug/1318624/+subscriptions

