user6794408 - 4 months ago
R Question

How do I get statistics of column "RetailSales2014"?

The "RetailSales2014" column contains money values. I know I need to remove the commas before I can run any statistical analysis, but do I also need to remove the leading '$' symbols? If so, how do I remove them?

# Load packages
library(RCurl)  # provides getURL()
library(XML)    # provides htmlParse() and readHTMLTable()

# Specify URL

url <- ""

# Download the content of the URL

url_content <- getURL(url)

# Parse the HTML/XML content to generate an R structure representing the HTML/XML tree

doc <- htmlParse(url_content)

tables <- readHTMLTable(doc)

# Convert the 3rd element of the list to a data frame

retailer_df <- data.frame(tables[[3]])


# Rename retailer_df columns

colnames(retailer_df) <- c("Rank","Company","Headquarter","RetailSales2014","USASalesGrowth","WorldwideRetailSales","USAPercentageOfWorldwideSales","Stores2014","Growth")


# Write the retailer data into a csv file under the working directory

write.csv(retailer_df, file = "top100retailers2015.csv")

# Remove every non-digit character (this covers both "$" and ","), then convert to numeric
retailer_df$RetailSales2014 <- 
    as.numeric(gsub("\\D", "", retailer_df$RetailSales2014))
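On the '$' question: yes, it has to go as well, because as.numeric() returns NA for any string that still contains a non-numeric character. One caveat with the "\\D" pattern is that it also deletes decimal points, which is only safe if every value is a whole number. Below is a minimal sketch that strips just '$' and ',' instead and then computes some statistics. The sample values and the names sales_raw and sales_num are made up for illustration; they are not taken from the real table.

```r
# Hypothetical sample values standing in for retailer_df$RetailSales2014
sales_raw <- c("$343,624,000", "$103,033,000", "$79,694,000.50")

# as.numeric() cannot parse "$" or ",", so the raw strings become NA
raw_attempt <- suppressWarnings(as.numeric(sales_raw))

# Remove only "$" and ","; inside [...] the "$" is a literal character,
# and the decimal point is left in place
sales_num <- as.numeric(gsub("[$,]", "", sales_raw))

# Basic statistics of the cleaned column
summary(sales_num)
mean(sales_num)
sd(sales_num)
```

The same gsub() call should work on the real column, assuming its values follow the same "$1,234" pattern; summary() then reports the minimum, quartiles, mean and maximum in one go.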