boop boop - 23 days ago 7
R Question

For loop compilation error in R

I have a data frame in R that lists monthly sales data by department for a store. Each record contains a month/year, a department name, and the total sales in that department for the month. I'm trying to calculate the mean sales by department, adding them to the vector avgs, but I seem to be having two problems: the total sales per department is not compiling at all (it's evaluating to zero) and avgs is compiling by record instead of by department. Here's what I have:

avgs = c()
for(dept in data$departmentName){
    total <- 0
    for(record in data){
        if(identical(data$departmentName, dept)){
            total <- total + data$ownerSales[record]
        }
    }
    avgs <- c(avgs, total/72)
}


Upon looking at avgs on completion of the loop, I find that it's returning a vector of zeroes the length of the data frame rather than a vector of 22 averages (there are 22 departments). I've been tweaking this forever and I'm sure it's a stupid mistake, but I can't figure out what it is. Any help would be appreciated.

Answer

Why not use dplyr? Here is the same kind of grouped summary on the built-in iris data:

library(dplyr)
data(iris)

iris %>% group_by(Species) %>% # or dept
    summarise(total_plength = sum(Petal.Length), # total owner sales
              weird_divby72 = total_plength/72) # total/72?
# A tibble: 3 × 3
     Species total_plength weird_divby72
      <fctr>         <dbl>         <dbl>
1     setosa          73.1      1.015278
2 versicolor         213.0      2.958333
3  virginica         277.6      3.855556

Your case would probably look like this:

data %>% group_by(departmentName) %>%
    summarise(total_sales = sum(ownerSales),
              monthly_sales = total_sales/72)
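
If you'd rather stay in base R, something like the sketch below should give the same per-department result. The sample data frame is made up for illustration (two departments, three months each), and it uses mean() instead of dividing by 72, which is only equivalent if every department really has 72 monthly records.

```r
# Hypothetical sample data; the real data frame has 22 departments.
data <- data.frame(
    departmentName = c("Toys", "Toys", "Toys", "Garden", "Garden", "Garden"),
    ownerSales     = c(10, 20, 30, 5, 15, 25)
)

# tapply splits ownerSales by departmentName and applies mean to each group,
# returning one named value per department.
avgs <- tapply(data$ownerSales, data$departmentName, mean)
print(avgs)
```

The result is indexed by department name, so avgs["Toys"] pulls out a single department's mean.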

I like dplyr for its syntax and pipeability. I think it is a huge improvement over base R for ease of data wrangling. Here is a good cheat sheet to help you get rolling: https://www.rstudio.com/wp-content/uploads/2015/02/data-wrangling-cheatsheet.pdf
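
As for why the original loop gives zeroes, as far as I can tell there are three things going on. for(dept in data$departmentName) visits the department value of every row, not each distinct department, so avgs ends up as long as the data frame. for(record in data) iterates over the columns of the data frame, not its rows. And identical(data$departmentName, dept) compares the whole column against a single name, which is FALSE for any data frame with more than one row, so total never moves off zero. A minimal corrected loop, again on made-up sample data, might look like this:

```r
# Hypothetical sample data standing in for the real sales data frame.
data <- data.frame(
    departmentName = c("Toys", "Garden", "Toys", "Garden"),
    ownerSales     = c(10, 5, 30, 15)
)

avgs <- c()
# unique() gives one iteration per department instead of one per row.
for (dept in unique(data$departmentName)) {
    # Logical vector marking the rows that belong to this department.
    in_dept <- data$departmentName == dept
    total <- sum(data$ownerSales[in_dept])
    # Divide by the number of records for this department
    # (this would be 72 if each department has 72 monthly records).
    avgs[dept] <- total / sum(in_dept)
}
print(avgs)
```

The vectorised comparison data$departmentName == dept replaces the inner loop entirely, which is usually the more idiomatic route in R.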
