Elena Elena - 2 months ago 7
R Question

Group, Summarize and transpose in one step

I have a dataframe that looks like this

Vehicle Model Month Sales
A XXY 1 10
A XXY 1 100
A XXY 2 40
A XXY 3 10
A YYX 3 10
A YYX 3 33
B ZZZ 1 50
B ZZY 2 60


I want to be able to transform it as follows:

Vehicle Model 1 2 3 4 5
A XXY 2 1 1 0 0
A YYX 0 0 2 0 0
B ZZZ 1 0 0 0 0
B ZZY 0 1 0 0 0


So essentially I want to - Group by 2 fields( Vehicle, Model) and then Count the number of record in the column "Sales" BY Month, and transpose the data so that Month becomes column and Vehicle/Model my rows. Also some models might not have up to 12 months, but I want to display all the columns 1 to 12, leaving 0 if no data available

The dataframe is quite large. Any recommendation? Thanks

Answer

Not sure about the "one step" part of the question. Doing it in several steps doesn't work?

library(data.table)
s <- "      Vehicle Model Month Sales
      A   XXY     1    10
      A   XXY     1   100
      A   XXY     2    40
      A   XXY     3    10
      A   YYX     3    10
      A   YYX     3    33
      B   ZZZ     1    50
      B   ZZY     2    60"
dt <- fread(s)

dt[, sale_count := .N, by = .(Vehicle, Model, Month)]
dt[, Sales := NULL]
dt <- unique(dt)
dcast(dt, Vehicle + Model ~ Month, value.var = "sale_count")

result: there is NA instead of 0 for months not available. You can change it to 0 if you want.

   Vehicle Model  1  2  3
1:       A   XXY  2  1  1
2:       A   YYX NA NA  2
3:       B   ZZY NA  1 NA
4:       B   ZZZ  1 NA NA
Comments