JohnL_10 - 1 year ago
R Question

Split dataframe based on one column in R, with a non-fixed width column

I have a problem that is an extension of a well-covered issue here on SE, namely:

Split a column of a data frame to multiple columns

My data has a column with a string format, comma-separated, but of no fixed length.

data = data.frame(id = c(1,2,3), treatments = c("1,2,3", "2,3", "8,9,1,2,4"))


So I would like to have my dataframe eventually be in the proper tidy/long form of:

id treatments
1 1
1 2
1 3
...
3 1
3 2
3 4


Something like separate or strsplit doesn't seem on its own to be the solution. separate fails with warnings that various columns have too many values (NB id 3 has more values than id 1).

Thanks

Answer Source

You can use tidyr::separate_rows, which splits a delimited column into one row per value:

library(tidyr)
separate_rows(data, treatments)

#   id treatments
#1   1          1
#2   1          2
#3   1          3
#4   2          2
#5   2          3
#6   3          8
#7   3          9
#8   3          1
#9   3          2
#10  3          4
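
If you would rather not depend on tidyr, the same reshaping can probably be done in base R. This is only a sketch, assuming the separator is always a plain comma and every value is an integer; `parts` and `long` are names made up for this example.

```r
# Same example data as in the question. stringsAsFactors = FALSE keeps
# treatments as character on R versions before 4.0.
data <- data.frame(id = c(1, 2, 3),
                   treatments = c("1,2,3", "2,3", "8,9,1,2,4"),
                   stringsAsFactors = FALSE)

# Split each string on commas; the result is a list with one
# character vector per original row.
parts <- strsplit(data$treatments, ",", fixed = TRUE)

# Repeat each id once per extracted value, then flatten the list.
# as.integer() assumes every piece is a whole number.
long <- data.frame(id = rep(data$id, lengths(parts)),
                   treatments = as.integer(unlist(parts)))
```

Here `lengths(parts)` gives the number of values per row, which is what lets `rep()` cope with the non-fixed width. With separate_rows the new column stays character unless you pass convert = TRUE.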