Daniel Stekhoven - 3 months ago
R Question

Gather multiple date/value columns using tidyr

I have a data set containing (amongst others) multiple columns with dates and corresponding values (repeated measurements). Is there a way to turn this into a long data set containing (the others and) only two columns, one for dates and one for values, using tidyr?

The following code produces an example data frame:

df <- data.frame(
id = 1:10,
age = sample(100, 10),
date1 = as.Date('2015-09-22') - sample(100, 10),
value1 = sample(100, 10),
date2 = as.Date('2015-09-22') - sample(100, 10),
value2 = sample(100, 10),
date3 = as.Date('2015-09-22') - sample(100, 10),
value3 = sample(100, 10))


The input table could (chance of 1 in 1.8x10^138) look like this:

   id age      date1 value1      date2 value2      date3 value3
1   1  32 2015-08-01     37 2015-07-15     38 2015-09-09     81
2   2  33 2015-07-22     16 2015-06-26      1 2015-09-12     58
...
10 10  64 2015-07-23     78 2015-08-25     70 2015-08-05     90


What I finally want is this:

   id age       date value
1   1  32 2015-08-01    37
2   1  32 2015-07-15    38
3   1  32 2015-09-09    81
4   2  33 2015-07-22    16
5   2  33 2015-06-26     1
...
30 10  64 2015-08-05    90


Any help doing this in tidyr or reshape would be greatly appreciated.

Answer

There is probably a more efficient way, but here is one approach.

Work on the date and value columns separately:

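library(dplyr)     # %>%, select, full_join
library(reshape2)  # melt
# for date: stack date1..date3 into a single "date" column
df.date <- df %>% select(id, age, date1, date2, date3) %>% melt(id.vars = c("id", "age"), value.name = "date")
# for val: stack value1..value3 into a single "value" column
df.val <- df %>% select(id, age, value1, value2, value3) %>% melt(id.vars = c("id", "age"), value.name = "value")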

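Now join. The measurement index has to be part of the join key: joining on id and age alone pairs every date with every value of the same id and returns 90 rows instead of 30.

# Join on the measurement index too, so that date k pairs only with value k
df2 <- full_join(mutate(df.date, k = as.integer(variable)),
                 mutate(df.val, k = as.integer(variable)), by = c("id", "age", "k"))
df2 %>% arrange(id, k) %>% select(-variable.x, -variable.y, -k)

For comparison, this is the start of the result when only id and age are used as keys; the same date is repeated against all three values:

   id age       date value
1   1  40 2015-07-19    28
2   1  40 2015-07-19    49
3   1  40 2015-07-19    24
4   2  33 2015-06-27    99
5   2  33 2015-06-27    18
6   2  33 2015-06-27    26
7   3  75 2015-07-07    63
8   3  75 2015-07-07    74
9   3  75 2015-07-07    72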
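
Since the question asks for tidyr specifically, here is a minimal sketch of a single-step alternative. It assumes tidyr 1.0.0 or later (which provides pivot_longer) and that the columns follow the dateN/valueN naming pattern of the example data. The index column name k and the set.seed call are additions made here for illustration and are not part of the original question.

```r
library(tidyr)
library(dplyr)

set.seed(1)  # assumption: only here to make the example reproducible
df <- data.frame(
  id = 1:10,
  age = sample(100, 10),
  date1 = as.Date('2015-09-22') - sample(100, 10),
  value1 = sample(100, 10),
  date2 = as.Date('2015-09-22') - sample(100, 10),
  value2 = sample(100, 10),
  date3 = as.Date('2015-09-22') - sample(100, 10),
  value3 = sample(100, 10))

long <- df %>%
  pivot_longer(
    cols = matches("^(date|value)[0-9]+$"),
    # ".value" turns the first capture group (date / value) into output
    # columns; the second group (1, 2, 3) goes into the helper column k
    names_to = c(".value", "k"),
    names_pattern = "^(date|value)([0-9]+)$"
  ) %>%
  select(-k)  # drop the measurement index once the pairs are formed

long
```

Because each date/value pair is read from the same source row and the same index, no join is needed and the result has 3 rows per id (30 in total). Keep k instead of dropping it if the measurement order matters later.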