user2579689 user2579689 - 2 months ago 22
R Question

dplyr tidyr to widen and sum specific columns

I'm struggling to use dplyr and tidyr to take a df in this form:

myDf <- data.frame(id = c(1,1,1,1,2,2),
event = c('a','b','a','b','a','b'),
a_property = c(1,NA,2, NA, 3, NA),
b_property = c(NA,2,NA, 3, NA, 4))

> myDf
id event a_property b_property
1 a 1 NA
1 b NA 2
1 a 2 NA
1 b NA 3
2 a 3 NA
2 b NA 4


and transform into this desired format:

id count_event_a count_event_b sum_property_a sum_property_b
1 2 2 3 5
2 1 1 5 4

Answer

A little more general:

myDf %>%
  gather(key, value, -id, -event) %>%
  filter(!is.na(value)) %>%
  group_by(id, event) %>%
  summarise(count = n(),
            sum = sum(value)) %>%
  gather(key, value, -id, -event) %>%
  unite(measure, key, event) %>%
  spread(measure, value)
Comments