ds_user ds_user - 22 days ago 6
R Question

Count on each group matching the criteria R

I have the below dataset, I am trying to find the number of days each machine was active and in sleep mode based on SwitchedOnDate.

MachineID InstalledDate SwitchedOnDate Status
1 2010-02-18 2010-02-19 SleepMode
1 2010-02-18 2010-02-20 Active
1 2010-02-18 2010-02-21 SleepMode
1 2010-02-18 2010-02-22 Active
2 2010-02-20 2010-02-21 Active
2 2010-02-20 2010-02-22 SleepMode
3 2010-02-10 2010-02-18 SleepMode
4 2010-03-10 2010-03-15 Active


So my output should look something like this,

MachineID SleepModeDays ActiveDays
1 2 2
2 1 1
3 1 0
4 0 1

Answer

You can use count to group and count occurrences of each level of the last variable you pass it, and then spread to rearrange the data to wide form:

library(tidyverse)

df %>% count(MachineID, Status) %>% spread(Status, n, fill = 0L)

## Source: local data frame [4 x 3]
## Groups: MachineID [4]
## 
##   MachineID Active SleepMode
## *     <int>  <int>     <int>
## 1         1      2         2
## 2         2      1         1
## 3         3      0         1
## 4         4      1         0

A base R alternative:

tab <- table(df$MachineID, df$Status)

data.frame(MachineID = rownames(tab), as.data.frame.matrix(tab))

##   MachineID Active SleepMode
## 1         1      2         2
## 2         2      1         1
## 3         3      0         1
## 4         4      1         0

or for a very direct route,

janitor::crosstab(df, MachineID, Status)

##   MachineID Active SleepMode
## 1         1      2         2
## 2         2      1         1
## 3         3      0         1
## 4         4      1         0

Note these approachs makes a few assumptions, e.g. that you can't have multiple occurrences of the same combination of MachineID, SwitchedOnDate, and Status. If your data gets more complicated, account for such.