Daniel Stekhoven - 1 year ago
R Question

Gather multiple date/value columns using tidyr

I have a data set containing (amongst others) multiple columns with dates and corresponding values (repeated measurements). Is there a way to turn this into a long data set containing (the others and) only two columns - one for dates and one for values - using tidyr?

The following code produces an example data frame:

df <- data.frame(
  id = 1:10,
  age = sample(100, 10),
  date1 = as.Date('2015-09-22') - sample(100, 10),
  value1 = sample(100, 10),
  date2 = as.Date('2015-09-22') - sample(100, 10),
  value2 = sample(100, 10),
  date3 = as.Date('2015-09-22') - sample(100, 10),
  value3 = sample(100, 10))


The input table could (chance of 1 in 1.8x10^138) look like this:

   id age      date1 value1      date2 value2      date3 value3
1   1  32 2015-08-01     37 2015-07-15     38 2015-09-09     81
2   2  33 2015-07-22     16 2015-06-26      1 2015-09-12     58
...
10 10  64 2015-07-23     78 2015-08-25     70 2015-08-05     90


What I finally want is this:

   id age       date value
1   1  32 2015-08-01    37
2   1  32 2015-07-15    38
3   1  32 2015-09-09    81
4   2  33 2015-07-22    16
5   2  33 2015-06-26     1
...
30 10  64 2015-08-05    90


Any help doing this in tidyr or reshape would be greatly appreciated.

Answer

There is probably a more efficient way, but here is one approach.

Melt the date columns and the value columns separately:

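library(dplyr); library(reshape2)  # reshape2 provides melt()
# dates in long form: one row per id and date column
df.date <- df %>% select(id, age, date1, date2, date3) %>%
  melt(id.vars = c("id", "age"), value.name = "date")
# values in long form: one row per id and value column
df.val <- df %>% select(id, age, value1, value2, value3) %>%
  melt(id.vars = c("id", "age"), value.name = "value")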

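Now join. The measurement number has to be part of the join key: joining on id and age alone pairs every date with every value and returns 9 rows per id instead of 3. Stripping the "date"/"value" prefix from the variable column leaves the measurement number (1, 2, 3) in both tables:

df2 <- full_join(df.date %>% mutate(variable = sub("date", "", variable)),
                 df.val %>% mutate(variable = sub("value", "", variable)),
                 by = c("id", "age", "variable"))
df2 %>% select(-variable)

The result has 30 rows, ordered by measurement and then by id:

  id age       date value
1  1  40 2015-07-19    28
2  2  33 2015-06-27    99
3  3  75 2015-07-07    63
...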
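
Since the question asks for tidyr specifically, here is a minimal sketch of a single-step alternative. It assumes tidyr 1.0.0 or later, where pivot_longer() is available; this is not part of the original answer. The ".value" entry in names_to tells pivot_longer() to use the first captured part of each column name ("date" or "value") as the output column name. The helper column name "measurement" and the set.seed() call are introduced here only for illustration.

```r
library(tidyr)

set.seed(1)  # assumption: seed added only to make this sketch reproducible
df <- data.frame(
  id = 1:10,
  age = sample(100, 10),
  date1 = as.Date("2015-09-22") - sample(100, 10),
  value1 = sample(100, 10),
  date2 = as.Date("2015-09-22") - sample(100, 10),
  value2 = sample(100, 10),
  date3 = as.Date("2015-09-22") - sample(100, 10),
  value3 = sample(100, 10))

# Split each column name into a prefix (date|value) and a measurement number.
# ".value" turns the prefix into the output column name, so date1..date3 end
# up in one "date" column and value1..value3 in one "value" column.
long <- pivot_longer(
  df,
  cols = -c(id, age),
  names_to = c(".value", "measurement"),  # "measurement" is a hypothetical name
  names_pattern = "(date|value)(\\d+)")

# Drop the helper column to match the layout asked for in the question.
long$measurement <- NULL
```

Each original row contributes three consecutive rows to long, so the result is grouped by id as in the desired output, and the date column keeps its Date class.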