umm umm - 3 months ago 40
R Question

Web Scraping using Rvest on a Tennis table from Wiki

I am a total beginner in R, and I am trying to learn more about rvest and how to scrape data from the web. Here is the Wikipedia page (https://en.wikipedia.org/wiki/Andy_Murray), and below is the table I want to load into R.

[Image: the table that I want]

Using a CSS selector tool, I found that the table matches the selector ".wikitable". Following some tutorials on other webpages, here is the code that I used:

library(rvest)
tennis <- read_html("https://en.wikipedia.org/wiki/Andy_Murray")
trial <- tennis %>% html_nodes(".wikitable") %>% html_table(fill = TRUE)
trial


I could not narrow the result down to the one table that I wanted. Can someone please teach me how? Another thing: what does the pipe (%>%) do?

Answer

You were almost there. What you extracted is a list of data frames, one for each table on the page that matches ".wikitable". To get the table you want, index into the list:

trial[[2]]

To clean it further use:

df <- trial[[2]]      # pick the second table from the list
df <- df[-1,]         # drop the first row
df[,17:20] <- NULL    # drop columns 17 to 20
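
If you are not sure which element of the list you need, it can help to inspect the list before indexing. Here is a minimal sketch of that idea. It uses a small made-up HTML string instead of the real Wikipedia page, so the names page, tables and second, and the table contents, are only assumptions for illustration:

```r
library(rvest)

# Two tiny tables standing in for the many .wikitable elements on a real page
# (made-up HTML, not taken from Wikipedia)
page <- read_html('<html><body>
  <table class="wikitable"><tr><th>a</th></tr><tr><td>1</td></tr></table>
  <table class="wikitable"><tr><th>b</th><th>c</th></tr><tr><td>2</td><td>3</td></tr></table>
</body></html>')

tables <- page %>% html_nodes(".wikitable") %>% html_table()

class(tables)         # a plain list, one element per matched table
length(tables)        # how many tables were found
sapply(tables, ncol)  # column counts can help you spot the table you want

# [[ ]] returns the data frame itself; [ ] would return a list of length 1
second <- tables[[2]]
names(second)
```

On the real page you would run the same checks on trial and then pick the index whose dimensions and column names match the table you are after.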


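%>% is the pipe operator from the magrittr package (dplyr and rvest make it available too). It passes the value on its left as the first argument of the function on its right, so x %>% f(y) is the same as f(x, y). See the magrittr documentation for more info.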
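
To see what the pipe does, here is a small sketch that compares nested calls with the piped version. The vector x and the names nested and piped are made up for illustration:

```r
library(magrittr)

x <- c(1, 4, 9)

# These two lines do the same thing: x becomes the first argument of sqrt()
sqrt(x)
x %>% sqrt()

# Extra arguments go after the piped value: x %>% round(1) is round(x, 1)
nested <- round(sum(sqrt(x)), 1)
piped  <- x %>% sqrt() %>% sum() %>% round(1)

identical(nested, piped)
```

The piped version reads left to right in the order the steps happen, which is why it is popular for chains like read_html() followed by html_nodes() and html_table().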
