Canovice - 1 month ago
R Question

r - learning RSelenium, a few basic beginner technical issues

I've looked at https://github.com/ropensci/RSelenium/issues/94 and https://github.com/ropensci/RSelenium/issues/82 but was not able to solve my problem. It didn't help that the person in those issues was on Windows, while I am on a Mac (El Capitan, version 10.11.6).

I am trying to learn data scraping with RSelenium, but some of the technical aspects of it are giving me issues early on. I have a few questions first and then will share my code:

(1) Right away, I get a warning that startServer() is deprecated. Specifically:

startServer()

# output
Warning message:
startServer is deprecated.
Users in future can find the function in
file.path(find.package("RSelenium"), "example/serverUtils").
The sourcing/starting of a Selenium Server is a users responsiblity.
Options include manually starting a server see
vignette("RSelenium-basics", package = "RSelenium")
and running a docker container see
vignette("RSelenium-docker", package = "RSelenium")


What should I use in place of startServer(), or what do I need to change on my computer? I'm confused as to what this warning message is saying.

(2) Since it's just a warning, I continue by trying to open a Chrome browser. I quickly run into another error:

remDr = remoteDriver$new(browserName = 'chrome')
remDr$open()

# output
[1] "Connecting to remote server"
$webdriver.remote.sessionid
[1] "4d0ad1d9-1c4b-4171-8dce-ba8363f5849e"

$locationContextEnabled
[1] TRUE

$webStorageEnabled
[1] TRUE

$takesScreenshot
[1] TRUE

$javascriptEnabled
[1] TRUE

$message
[1] "session not created exception\nfrom unknown error: Runtime.executionContextCreated has invalid 'context': {\"auxData\":{\"frameId\":\"34144.1\",\"isDefault\":true},\"id\":1,\"name\":\"\",\"origin\":\"://\"}\n (Session info: chrome=54.0.2840.71)\n (Driver info: chromedriver=2.20.353124 (035346203162d32c80f1dce587c8154a1efa0c3b),platform=Mac OS X 10.11.6 x86_64)"

$hasTouchScreen
[1] TRUE

$platform
[1] "ANY"

$cssSelectorsEnabled
[1] TRUE

$id
[1] "4d0ad1d9-1c4b-4171-8dce-ba8363f5849e"


The $message line in the output says that the session was not created. On my desktop, Chrome opens for a split second and then closes or crashes. I then try Firefox and get:

remDr = remoteDriver$new(browserName = 'firefox')
remDr$open()

# output
[1] "Connecting to remote server"

Selenium message:The path to the driver executable must be set by the webdriver.gecko.driver system property; for more information, see https://github.com/mozilla/geckodriver. The latest version can be downloaded from https://github.com/mozilla/geckodriver/releases

Error: Summary: UnknownError
Detail: An unknown server-side error occurred while processing the command.
class: java.lang.IllegalStateException
Further Details: run errorDetails method


It is frustrating to try to learn this and not even be able to get past the very first step of opening a browser. Any help is greatly appreciated!

Answer

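As noted, checkForServer and startServer are deprecated, but you may still be able to use them as follows (the unlink call clears any previously downloaded server binary so that checkForServer fetches a fresh copy):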

unlink(file.path(find.package("RSelenium"), "bin"), recursive = TRUE, force = TRUE)
RSelenium::checkForServer()

For Firefox:

In a terminal, run the following command:

brew install geckodriver

Running Selenium on its default port (4444) is often a problem on a Mac, because Kerberos is frequently already listening on that port. Run the following in the R console, which starts the server on port 5556 instead:

selServ <- RSelenium::startServer(args = c("-port 5556"))
remDr <- RSelenium::remoteDriver(extraCapabilities = list(marionette = TRUE), port=5556)
remDr$open()
......
# when finished
selServ$stop()

For Chrome:

brew install chromedriver

Running Selenium on the default port on a Mac has the same issue here, so use a different port. Run the following in the R console:

selServ <- RSelenium::startServer(args = c("-port 5556"))
remDr <- RSelenium::remoteDriver(browserName = "chrome", 
                                 extraCapabilities = list(marionette = TRUE),
                                 port=5556)
remDr$open()
......
# when finished
selServ$stop()

If the above doesn't help, look at running a Docker container; see http://rpubs.com/johndharrison/RSelenium-Docker and https://github.com/SeleniumHQ/docker-selenium. This basically involves running a Docker container using something like:

$ docker run -d -p 5556:4444 selenium/standalone-chrome:3.0.1-aluminum

A Selenium server and Chrome browser should then be accessible on port 5556, which you can connect to by passing the appropriate arguments to remoteDriver.
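As a rough sketch of what "appropriate arguments" might look like for the Docker setup, the helper below wraps the connection in a function. The name open_docker_chrome is hypothetical and not part of RSelenium, and the host is assumed to be localhost; if Docker runs inside a virtual machine (for example via docker-machine), the VM's IP address would probably be needed instead.

```r
# Sketch only: assumes RSelenium is installed and that the container started
# with the docker command above publishes the Selenium server on host port 5556.
# `open_docker_chrome` is a hypothetical helper name, not an RSelenium function.
open_docker_chrome <- function(host = "localhost", port = 5556L) {
  remDr <- RSelenium::remoteDriver(
    remoteServerAddr = host,   # where the Selenium server is reachable
    port = port,               # host port mapped to the container's 4444
    browserName = "chrome"
  )
  remDr$open()  # asks the remote server to start a browser session
  remDr
}

# Example use (requires the container to be running):
# remDr <- open_docker_chrome()
# remDr$navigate("https://www.r-project.org")
# remDr$getTitle()
# remDr$close()
```

Because the browser runs inside the container, no local chromedriver or Selenium jar is involved in this route, which sidesteps the driver setup problems shown in the question.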