Eduardo Clark - 2 months ago
R Question

rvest: "Could not find possible submission target" when submitting a form

I'm trying to scrape results from a site that requires a form submission, and I'm using the rvest package to do it.

The code fails after running the following commands:

library(rvest)
library(dplyr)
library(XML)

BasicURL <- "http://www.blm.mx/es/tiendas.php"
QForm <- html_form(read_html(BasicURL))[[1]]
Values <- set_values(QForm, txt_5 = 11850, drp_1 = "-1")
Session <- html_session(BasicURL)
submit_form(session = Session, form = Values)

Error: Could not find possible submission target.


I think it might be because rvest can't find a standard submit button in the form. Is there a way to tell rvest which tags or buttons to look for?

Any help is greatly appreciated.

Answer

You can bypass rvest's form handling and POST the form fields directly with httr:

library(httr)
library(rvest)
library(purrr)
library(dplyr)

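# Send the same fields the form would submit, encoded as a form body
res <- POST("http://www.blm.mx/es/tiendas.php",
            body = list(txt_5 = "11850",
                        drp_1 = "-1"),
            encode = "form")

# Parse the HTML that comes back
pg <- read_html(content(res, as="text", encoding="UTF-8"))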

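# For each result block, collect its text nodes and paste them together in
# consecutive pairs, so every block becomes one row with columns x1, x2, ...
right <- html_nodes(pg, xpath=".//div[@class='tiendas_resultado_right']") %>%
  map(~html_nodes(., xpath=".//text()")) %>%
  map_df(function(x) {
    map(seq(1, length(x), 2), ~paste0(x[.:(.+1)], collapse="")) %>%
      trimws() %>%
      as.list() %>%
      setNames(sprintf("x%d", 1:length(.)))
  })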

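# Store names live in the left-hand block of each result
left <- html_nodes(pg, "div.tiendas_resultado_left") %>% html_text()

# tibble() replaces the deprecated data_frame(); dplyr re-exports it
df <- bind_cols(tibble(x0=left), right)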

glimpse(df)
## Observations: 7
## Variables: 6
## $ x0 <chr> "ABARROTES LA GUADALUPANA", "CASA ARIES", "COMERCIO RED QIUBO", "FERROCARRIL 4", "LA FLOR DE JALISCO", "LA MIGAJA", "VIA LACTEA"
## $ x1 <chr> "Calle IGNACIO ESTEVA", "Calle PARQUE LIRA", "Calle GENERAL JOSE MORAN No 74 LOCAL B", "Calle MELCHOR MUZQUIZ", "Calle MELCHOR M...
## $ x2 <chr> "Col. San Miguel Chapultepec I Sección", "Col. San Miguel Chapultepec I Sección", "Col. San Miguel Chapultepec I Sección", "Col....
## $ x3 <chr> "Municipio/Ciudad Miguel Hidalgo", "Municipio/Ciudad Miguel Hidalgo", "Municipio/Ciudad Miguel Hidalgo", "Municipio/Ciudad Migue...
## $ x4 <chr> "CP 11850", "CP 11850", "CP 11850", "CP 11850", "CP 11850", "CP 11850", "CP 11850"
## $ x5 <chr> "Estado Distrito Federal", "Estado Distrito Federal", "Estado Distrito Federal", "Estado Distrito Federal", "Estado Distrito Fed...
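
The pairing of text nodes in the map_df() step may be easier to follow on a small inline document that needs no network access. This is only a sketch: the HTML string below is made up for illustration (it is loosely modelled on the output above, and the site's real markup may differ), and the names html, block, txt and pairs are hypothetical.

```r
library(rvest)
library(purrr)

# Hypothetical markup: one result block with two label/value pairs.
# The real page's structure is an assumption here.
html <- '<div class="tiendas_resultado_right"><b>Calle</b> PARQUE LIRA<br><b>CP</b> 11850</div>'
pg <- read_html(html)

# Take the first result block and pull out its text nodes as strings.
block <- html_nodes(pg, xpath = ".//div[@class='tiendas_resultado_right']")[[1]]
txt <- html_text(html_nodes(block, xpath = ".//text()"))

# Glue node 1 to node 2, node 3 to node 4, and so on, then trim whitespace.
pairs <- map_chr(seq(1, length(txt), 2),
                 ~ trimws(paste0(txt[.x:(.x + 1)], collapse = "")))
pairs
# "Calle PARQUE LIRA" "CP 11850"
```

The approach relies on every block holding an even number of text nodes in label/value order. If the real markup has stray whitespace nodes or a missing value, the pairs would shift, so it is worth checking length(txt) for a few blocks before trusting the columns.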