sikuli-driver team mailing list archive

Re: [Question #234883]: [research] Sikuli over WebDriver API / JSONWireProtocol?

Question #234883 on Sikuli changed:
https://answers.launchpad.net/sikuli/+question/234883

daluu posted a new comment:
What do you mean by workflow specifically?

In a nutshell, we'd start up a Sikuli-aware Selenium server (one that
drives Sikuli rather than browsers). It could be a JAR file, a binary,
or whatever form the implementation takes, and it would accept optional
command-line arguments such as the default image repository location.

This server listens for WebDriver commands (over HTTP via the
JSONWireProtocol), decodes each command, maps it to the appropriate
Sikuli command, and executes it. It then takes the Sikuli return value
(if any) and maps it back to a WebDriver command response to send to
the WebDriver client. Any exceptions are propagated back to the
WebDriver client, perhaps preprocessed so that the error messages make
more sense in WebDriver terms.
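
To make the decode/map/execute/respond loop concrete, here is a minimal
sketch of the dispatch step only. Everything in it is an assumption for
illustration: SikuliCommandDispatcher, ScreenActions, Handler and the
"image"/"text" parameter keys are hypothetical names, not Selenium or
Sikuli APIs, and ScreenActions merely stands in for whatever Sikuli
calls the real server would make. The status codes (0 success, 9
unknown command, 13 unknown error) follow the JSONWireProtocol.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: these types are NOT part of Selenium or Sikuli.
final class SikuliCommandDispatcher {

    /** Stand-in for the Sikuli calls the server would make (assumed interface). */
    interface ScreenActions {
        void click(String imagePath) throws Exception;
        void type(String imagePath, String text) throws Exception;
    }

    /** Handles one decoded command and returns the value for the response. */
    interface Handler {
        String handle(Map<String, String> params) throws Exception;
    }

    /** WebDriver-style response; in the JSONWireProtocol, status 0 means success. */
    static final class Response {
        final int status;
        final String value;

        Response(int status, String value) {
            this.status = status;
            this.value = value;
        }
    }

    static final int SUCCESS = 0;
    static final int UNKNOWN_COMMAND = 9;
    static final int UNKNOWN_ERROR = 13;

    private final Map<String, Handler> handlers = new HashMap<>();

    SikuliCommandDispatcher(final ScreenActions screen) {
        // Command names are illustrative labels for the decoded request.
        handlers.put("clickElement", params -> {
            screen.click(params.get("image"));
            return null;
        });
        handlers.put("sendKeysToElement", params -> {
            screen.type(params.get("image"), params.get("text"));
            return null;
        });
    }

    /** Maps a decoded command to its handler and wraps the outcome in a response. */
    Response dispatch(String commandName, Map<String, String> params) {
        Handler handler = handlers.get(commandName);
        if (handler == null) {
            return new Response(UNKNOWN_COMMAND, "Unknown command: " + commandName);
        }
        try {
            return new Response(SUCCESS, handler.handle(params));
        } catch (Exception e) {
            // Propagate the failure to the client as a WebDriver-style error.
            return new Response(UNKNOWN_ERROR, e.getMessage());
        }
    }
}
```

A real server would put an HTTP listener and JSON decoding in front of
dispatch() and would likely map specific Sikuli failures to more
specific WebDriver status codes; the sketch collapses them all into one
generic error.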

Starting the Sikuli-aware Selenium server may also create a Sikuli
instance as needed to perform the automation.

That's the general workflow I have in mind. As for example WebDriver
command usage:

// setting default Sikuli timeouts (e.g. for finding images) via WebDriver
driver.manage().timeouts().implicitlyWait(60, TimeUnit.SECONDS);
// there could be other examples; the above is just one, probably the primary one

// not sure whether there is any use or feasibility in "findElements"
// with Sikuli, but we could support findElement

// finds image from default repository location
driver.findElement(By.name("WindowsStartMenu.png")).click();

driver.findElement(By.xpath("C:\\Test\\RunDialogTextField.png")).sendKeys("notepad");

driver.findElement(By.cssSelector("base64encodedImageStringHere")).click();

I would assume a user of the Sikuli (Java) API could easily infer which
Sikuli API calls the above WebDriver commands should translate to. In
the base64 case, though, extra processing is needed: the string must be
decoded into a binary image stored in a temp directory before it is
passed to the Sikuli command.
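
As a rough illustration of that translation, the sketch below turns a
locator into an image file path that could then be handed to Sikuli.
ImageLocatorResolver is a hypothetical name, and the mapping (name =
file in the default repository, xpath = explicit file path, css
selector = base64 image data) is just the convention from the examples
above, not an established one. It also assumes the base64 data is a
PNG, which nothing in this thread guarantees.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Base64;

// Hypothetical helper: maps a WebDriver locator to an image file on disk.
final class ImageLocatorResolver {

    private final Path repository;

    ImageLocatorResolver(Path repository) {
        this.repository = repository;
    }

    /**
     * @param using locator strategy name as sent over the JSONWireProtocol
     * @param value locator value (file name, file path, or base64 image data)
     */
    Path resolve(String using, String value) throws IOException {
        switch (using) {
            case "name":
                // Image file name, looked up in the default repository.
                return repository.resolve(value);
            case "xpath":
                // Explicit path to an image file.
                return Paths.get(value);
            case "css selector":
                // Base64 image data: decode and store it in the temp directory.
                byte[] imageBytes = Base64.getDecoder().decode(value);
                // Assumption: the data is a PNG; the suffix is only a hint.
                Path tempImage = Files.createTempFile("sikuli-target-", ".png");
                tempImage.toFile().deleteOnExit();
                Files.write(tempImage, imageBytes);
                return tempImage;
            default:
                throw new IllegalArgumentException(
                        "Unsupported locator strategy: " + using);
        }
    }
}
```

With this in place, By.name("WindowsStartMenu.png") would resolve to
that file under the repository directory given on the command line, and
the base64 case would resolve to a freshly written temp file.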

There could be more use cases beyond click(), sendKeys(), and setting
timeouts, but I haven't used Sikuli enough, nor thought in detail about
which WebDriver APIs to support and which not. Still, this is a good
starting reference for a proof of concept, no?

As for remote Sikuli execution, it is already remote when using
WebDriver. The user instantiates a RemoteWebDriver client, which
connects to the Sikuli-aware Selenium server. The server listens for
client requests and proxies them to and from Sikuli, which runs locally
on the same machine as the server. This way it can be used remotely or
locally (localhost), and in Selenium Grid deployments.
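
For reference, this is roughly what a findElement call looks like on
the wire, so it is clear what the server has to decode. In the
JSONWireProtocol, finding an element is a POST to
/session/:sessionId/element with a JSON body carrying "using" and
"value". WireRequestSketch is a hypothetical helper that only builds
the path and body as strings; it sends nothing, and its escaping covers
only backslashes and double quotes, so treat it as a sketch rather than
a JSON encoder.

```java
// Hypothetical helper: builds the request a WebDriver client would send.
final class WireRequestSketch {

    /** Path of the find-element command, relative to the server's base URL. */
    static String findElementPath(String sessionId) {
        return "/session/" + sessionId + "/element";
    }

    /** JSON body of the find-element command. */
    static String findElementBody(String using, String value) {
        return "{\"using\":\"" + escape(using)
                + "\",\"value\":\"" + escape(value) + "\"}";
    }

    // Minimal escaping: backslashes first, then double quotes.
    private static String escape(String text) {
        return text.replace("\\", "\\\\").replace("\"", "\\\"");
    }
}
```

Whether the client is on the same machine or a remote one only changes
the host in the base URL; the request itself is identical.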

We're basically building the Sikuli-aware Selenium server by replacing
the internal Selenium code that manipulates browsers, once the WebDriver
command is decoded, with Sikuli commands. This work could be based on
modifying the Selenium server codebase, or perhaps another WebDriver
API based server such as Appium or ios-driver, whichever is easier and
better.

-- 
You received this question notification because you are a member of
Sikuli Drivers, which is an answer contact for Sikuli.