heesemonster heesemonster - 22 days ago 6
R Question

How to reshape data where values of grouped rows become columns? (long to wide?)

I've searched around for an adequate answer and I don't think that this question has been answered yet. Essentially, I have a data frame in long format that looks like this:

ID event_type event_value
123 A 1.1
123 A 1.2
123 A "Hello"
234 B "Hello"
456 A 2.8


Where there are multiple event types with a variety of values. What I want to do is to reshape the data such that it looks like this

ID event_type_A_1 event_type_A_2 event_type_A_3 event_type_B_1
123 1.1 1.2 "Hello" NA
234 NA NA NA "Hello"
456 2.8 NA NA NA


Such that the new columns extend out as far as the longest event type for any given patient, and the rest are filled with NA. I've mucked around with
spread()
and
cast()
, but for some reason it just isn't clicking. Thanks!

Answer

You can do this with dplyr and tidyr. The trick is using group_by and adding a within group index using mutate and row_number.

library(dplyr)
library(tidyr)

df <- data_frame(ID = c(123,123,123,234,456),
                 event_type = c("A","A","A","B","A"),
                 event_value = c(1.1, 1.2, "Hello", "Hello", 2.8))

df %>%
  group_by(ID) %>% 
  mutate(sub_ID = row_number()) %>% 
  unite("ID_type", event_type, sub_ID,remove = TRUE) %>% 
  spread(ID_type, event_value)

Breaking down the dplyr chain:

  1. Make groups by ID
  2. Create sub_ID with row_number()
  3. Combine event_type and sub_ID with unite and remove the original columns
  4. spread() the ID_type into columns with the values from event_value