Liam Flynn - 8 months ago
Javascript Question

Scraping a JavaScript website in R

I want to scrape the match time and date from this URL:

Using the Chrome dev tools, I can see that it appears to be generated by the following markup:

<td colspan="3" id="utime" class="mstat-date">01:20 AM, October 29, 2014</td>

But this is not in the source HTML.

I think this is because it's Java (correct me if I'm wrong). How can I scrape this information using R?


So, RSelenium is not the only answer (anymore). If you can install the PhantomJS binary (available from the PhantomJS download page), you can use it to render the HTML and then scrape the result with rvest. This is similar to the RSelenium approach but doesn't require Java:


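library(rvest)

# render HTML from the site with phantomjs

url <- ""

# write a small PhantomJS script that opens the page,
# prints the rendered source and then exits
writeLines(sprintf("var page = require('webpage').create();
page.open('%s', function () {
    console.log(page.content); //page source
    phantom.exit();
});", url), con="scrape.js")

system("phantomjs scrape.js > scrape.html")

# extract the content you need
# (read_html() replaces html(), which newer rvest versions no longer provide)
pg <- read_html("scrape.html")
pg %>% html_nodes("#utime") %>% html_text()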

## [1] "10:20 AM, October 28, 2014"
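
If you want to check the extraction step on its own, without PhantomJS or network access, you can parse the rendered cell from the question directly. This is only a minimal sketch: it assumes a current rvest where read_html() accepts a literal HTML string, and the variable names rendered and match_time are made up for illustration. The <td> is wrapped in a table here so the HTML parser keeps it.

```r
library(rvest)

# Rendered markup copied from the question, wrapped in <table><tr> so the
# parser treats the <td> as a valid table cell (the wrapper is an assumption).
rendered <- '<table><tr><td colspan="3" id="utime" class="mstat-date">01:20 AM, October 29, 2014</td></tr></table>'

# Parse the literal HTML string instead of a file written by PhantomJS
pg <- read_html(rendered)

# Same selector as in the answer: pick the element with id "utime"
match_time <- pg %>% html_nodes("#utime") %>% html_text()
print(match_time)
```

If this returns the date string but the same selector returns character(0) on scrape.html, the page was most likely not fully rendered when PhantomJS printed it, rather than the selector being wrong.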