Gilian Ponte - 1 month ago
R Question

I want to loop over a data frame containing URLs using rvest in R

First I scrape a number of URLs from a website and collect them into a data frame. Then I want to loop over the URLs I collected and scrape data from each page. This is my code:

library(rvest)
library(dplyr)
library(XLConnect)
##########GET URLS###################################################################################
urls <- read_html("http://www.klassiekshop.nl/labels/labels-a-e/brilliant-classics/?limit=all")

urls <- urls %>%
  html_nodes(".product-name a") %>%
  html_attr("href") %>%
  as.character()

url <- as.data.frame(urls)
as.character(url$urls)


#########EXTRACT URLS FROM DATAFRAME URLS############################################################
#########CREATE DATAFRAME############################################################################
EAN <- 0
price <- 0

df <- data.frame(EAN, price)

#########GET DATA####################################################################################
pricing_data <- for(i in urls){

  site <- read_html(i)
  print(i)
  stats <- data.frame(EAN = site %>% html_node("b") %>% html_text(),
                      price = site %>% html_node(".price") %>% html_text(),
                      stringsAsFactors = FALSE)
  data <- rbind(df, stats)
}


When I debug it, the loop does run over the URLs, but it doesn't collect the data. Does anyone know how to get the data from the site?

Thanks!

Answer

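It's because you're rbinding df to stats and storing the result in data, but you never change df. On every iteration data is overwritten with the initial df row plus only the latest stats row, so nothing accumulates. I think you want to change the last line inside the loop to: df <- rbind(df, stats)

Also note that a for loop in R returns NULL, so pricing_data will not hold the results; after the fix they end up in df.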
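
To make the fix concrete, here is a minimal sketch of the corrected loop. So that it runs without hitting the real site, it loops over two made-up HTML strings instead of the scraped URLs; the `pages` vector and the EAN/price values in it are placeholders I invented for illustration, not data from klassiekshop.nl. The selectors are the ones from the question. It also starts from an empty data frame, which is my own change, so the dummy 0/0 row does not end up in the result.

```r
library(rvest)

# Stand-in pages so the sketch runs offline; the markup and values are
# made-up placeholders, not data from the real site.
pages <- c(
  '<html><body><b>0000000000001</b><span class="price">1,00</span></body></html>',
  '<html><body><b>0000000000002</b><span class="price">2,00</span></body></html>'
)

# Start from an empty data frame instead of a dummy 0/0 row.
df <- data.frame(EAN = character(), price = character(),
                 stringsAsFactors = FALSE)

for (i in pages) {
  site <- read_html(i)  # with real data, i would be a URL taken from `urls`
  stats <- data.frame(
    EAN   = site %>% html_node("b") %>% html_text(),
    price = site %>% html_node(".price") %>% html_text(),
    stringsAsFactors = FALSE
  )
  df <- rbind(df, stats)  # assign back to df so the rows accumulate
}

print(df)
```

With the real data you would loop over `urls` instead of `pages`. Growing a data frame with rbind inside a loop gets slow when there are many pages; if that becomes a problem, one alternative is to collect the one-row data frames in a list and combine them once at the end with do.call(rbind, ...).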