sikuli-driver team mailing list archive

[Bug 1408937] Re: [request] Self learning: Should be possible to recapture a not found image on the fly and retry

 

Yeah, I was thinking about that some years ago when I first discovered
Sikuli. The reason was that I had to train Sikuli for different systems,
and sometimes the time needed to write a new script was 5x higher than
just clicking through the stuff by hand for a month. So my automation
with "manual learning" (writing scripts) needed 5 months before the
result paid off, and, worse, it constantly needed improvements. So I
abandoned Sikuli at that point. The pivot to Java was another reason to
abandon it - I don't have time for prototyping in Java.

So, over these years the idea stayed the same, but now I know a little
bit more about the underlying technology that can enable it. I think that
we need Caffe (http://caffe.berkeleyvision.org/) or Theano
(http://deeplearning.net/software/theano/), plus a speed-up on GPU. But
that's about technology. The overall scheme:

Sikuli maintains a database (or better said - a matrix) of images that
are known to it. Yes, that means a lot of data. Nobody says that these
should be images like the ones we use now. We don't store images in our
brain - we store the effect of those images. So, the images are just the
source data. The database - I will call it 'matrix' (just to keep
everyone confused) - can answer these questions:

1/ whether the current state of the image on the screen is known to it or not
2/ whether there are any known areas
3/ whether there are any known areas with unusual content
4/ whether there are any unknown areas
5/ whether there is any unknown state

So, the idea is that the system SHOULD KNOW ABOUT ALL states. The
non-important states can be generalized, but the idea is that the system
is ALWAYS AWARE of what's going on on the screen, and just chooses to
react or not.
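
To make the questions above a bit more concrete, here is a minimal
sketch in Python. Everything in it is an assumption for illustration:
the class name Matrix, the idea of cutting the screen into tiles, and
exact matching of tiles (a real system would compare learned features
with some similarity threshold, not raw pixels). It only covers
questions 1, 2 and 4.

```python
from dataclasses import dataclass, field


@dataclass
class Matrix:
    """Toy stand-in for the 'matrix' (hypothetical name and design).

    A screen is a list of tiles; a tile is any hashable value, e.g. a
    tuple of pixel values. Exact matching is a simplifying assumption.
    """
    known_tiles: set = field(default_factory=set)
    known_states: set = field(default_factory=set)

    def learn(self, screen):
        # Remember every tile and the screen as a whole.
        self.known_tiles.update(screen)
        self.known_states.add(tuple(screen))

    def is_known_state(self, screen):
        # Question 1: has this exact screen been seen before?
        return tuple(screen) in self.known_states

    def known_areas(self, screen):
        # Question 2: indices of tiles that were seen before.
        return [i for i, tile in enumerate(screen) if tile in self.known_tiles]

    def unknown_areas(self, screen):
        # Question 4: indices of tiles that were never seen.
        return [i for i, tile in enumerate(screen) if tile not in self.known_tiles]
```

After matrix.learn(screen), a screen made only of seen tiles gives an
empty unknown_areas() list, while any new tile shows up there by index.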

So, 'area' is the thing that is specific to us. If we are operating
with GUI windows, we need to train the concept of a window. This trained
data can then be kept in some 'domain' (a specific part of the matrix),
so that Sikuli can count windows without second-guessing their content.
So, it is a layered concept model, much like in humans.

1. Generic screen - I see something and don't see 
2. Windows - I see windows, I know what they look like
3. Context - I know windows, I track what is inside, how many of them there are and where they are

The things that can help to understand what I mean are:
 - HMMs (hidden Markov models), used in speech synthesis/recognition and parameters
 - Predator - a self-learning algorithm for video image recognition

So, we need to decide how we train the system to build a loop.

    [observe]  -->  [detect]  --> [action] --> [learn] -->  <-- repeat

[observe] is just watching the screen and matching it against the matrix.
[detect] is evaluating the condition of the system - is it 'worried' that
there is something unknown or that something is not a good match?
[action] is taken AUTOMATICALLY when the system is 99.9% sure of
what's going on (e.g. it did that in those conditions 1000 times and
everything was ok), or, if the system is not sure, a person can be
ASKED FOR ADVICE.
[learn] depending on the person's feedback, the new information about
the image, the conditions and the desired outcome is saved in the data
(associated into the matrix).
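
A rough sketch of that loop in Python, only to show the shape of it.
All names here (Learner, ask_person, perform) are made up for this
example, states are plain strings instead of matched screens, and
"sure" is approximated as a success rate over a minimum number of
trials, which is my assumption and not a worked-out design.

```python
from collections import defaultdict


class Learner:
    """Hypothetical observe/detect/action/learn loop for one state at a time."""

    def __init__(self, ask_person, min_trials=1000, min_confidence=0.999):
        self.ask_person = ask_person          # callback: state -> action name
        self.min_trials = min_trials          # "did that 1000 times"
        self.min_confidence = min_confidence  # "99.9% sure"
        # (state, action) -> [successes, trials]
        self.stats = defaultdict(lambda: [0, 0])

    def detect(self, state):
        """Return an action the system is sure about, or None if 'worried'."""
        for (seen_state, action), (ok, trials) in self.stats.items():
            if (seen_state == state and trials >= self.min_trials
                    and ok / trials >= self.min_confidence):
                return action
        return None

    def learn(self, state, action, ok):
        # Save the outcome so the next detect() can use it.
        record = self.stats[(state, action)]
        record[0] += 1 if ok else 0
        record[1] += 1

    def step(self, state, perform):
        """One pass of the loop; `state` is what [observe] produced."""
        action = self.detect(state)
        asked = action is None
        if asked:
            action = self.ask_person(state)   # ask a person for advice
        ok = perform(state, action)           # act, report success/failure
        self.learn(state, action, ok)
        return action, asked
```

With min_trials=3 the person is asked on the first three passes for a
given state and, if all of them went ok, the fourth pass acts on its own.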

Yep. I would do this in Python, because it doesn't require
recompilation and is much, much faster for prototyping. I think we
don't have such chances with Java - I would spend months just trying
to call one library from another.

-- 
You received this bug notification because you are a member of Sikuli
Drivers, which is subscribed to Sikuli.
https://bugs.launchpad.net/bugs/1408937

Title:
  [request] Self learning: Should be possible to recapture a not found
  image on the fly and retry

Status in Sikuli:
  In Progress

Bug description:
  Hi,

  I was thinking about how machine learning can be used to improve automated scripts and how we can make a script learn.
  I automate SAP PC resolutions, and no matter what logic you write, some day an unexpected error will come up.
  We cannot control that.
  But I was thinking: what if we could write logic so that, if a screenshot of an error is not recognized and an exception occurs:
  1) A screen pops up asking you to take a screenshot of the new error.
  2) It saves that screenshot and then asks you to select one of some predefined options, like 1) repeat, 2) wait, and so on.
  3) The next time the script runs, it won't ask you about this type of error.

  Although I can keep updating the code whenever an error occurs, I
  would like to implement this so that any non-programmer is also
  able to easily update the steps and update the script.

  Please suggest if someone knows a way to implement it, or any other
  idea that is actually self-learning.

To manage notifications about this bug go to:
https://bugs.launchpad.net/sikuli/+bug/1408937/+subscriptions

