Nancy Nancy - 2 months ago 14
R Question

Undo reshape with arbitrary number of columns created

I would like to undo a reshape after converting a long data frame to wide format by generating numbered versions of single variables. The challenge I'm facing is doing this when there are multiple key variables and multiple variables that need to get re-combined. I've tried using gather from tidyr to no avail. Take this example of long data:

toy = data.frame(
  first_key = rep(c("A", "B", "C"), each = 6),
  second_key = rep(rep(c(1:2), each = 3), 3),
  colors = c("red", "yellow", "green", "blue", "purple", "beige"),
  days = c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday"),
  index = c(1:3)
)


Which gives the following data.frame:

first_key second_key colors days index
A 1 red Monday 1
A 1 yellow Tuesday 2
A 1 green Wednesday 3
A 2 blue Thursday 1
A 2 purple Friday 2
A 2 beige Saturday 3
B 1 red Monday 1
B 1 yellow Tuesday 2
B 1 green Wednesday 3
B 2 blue Thursday 1
B 2 purple Friday 2
B 2 beige Saturday 3
C 1 red Monday 1
C 1 yellow Tuesday 2
C 1 green Wednesday 3
C 2 blue Thursday 1
C 2 purple Friday 2
C 2 beige Saturday 3


Reshaping it to a wide format with numbered versions of the variables looks like this:

toy_wide = reshape(toy, idvar = c("first_key", "second_key"),
timevar = "index", direction = "wide", sep = "_")


And gives this wide format:

first_key second_key colors_1 days_1 colors_2 days_2 colors_3 days_3
A 1 red Monday yellow Tuesday green Wednesday
A 2 blue Thursday purple Friday beige Saturday
B 1 red Monday yellow Tuesday green Wednesday
B 2 blue Thursday purple Friday beige Saturday
C 1 red Monday yellow Tuesday green Wednesday
C 2 blue Thursday purple Friday beige Saturday


But how do I get it back to the original format? I've tried the following but I get an error.

tidyr::gather(toy_wide, key = c("first_key", "second_key"), value = c("days", "colors"),
colors_1:days_3, factor_key = TRUE)



Error: Invalid column specification

Answer

Here is another option with melt from data.table which can take multiple measure patterns.

library(data.table)
# patterns() groups the wide columns into one measure set per output column
melt(setDT(toy_wide), measure.vars = patterns("^colors", "^days"),
   value.name = c("colors", "days"), variable.name = "index")[order(first_key, second_key)]
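
A base R alternative may also be worth trying. As far as I know, reshape stores the arguments of the wide conversion in a "reshapeWide" attribute on its result, so calling it again with direction = "long" can reverse the operation without listing the columns. The sketch below rebuilds toy and toy_wide as plain data.frames (setDT above converts toy_wide in place, and I have not checked this on a data.table). The name toy_back and the final re-sorting step are my own additions for illustration, not part of the question.

```r
# Rebuild the example from the question as plain data.frames.
toy <- data.frame(
  first_key = rep(c("A", "B", "C"), each = 6),
  second_key = rep(rep(1:2, each = 3), 3),
  colors = c("red", "yellow", "green", "blue", "purple", "beige"),
  days = c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday"),
  index = 1:3
)
toy_wide <- reshape(toy, idvar = c("first_key", "second_key"),
                    timevar = "index", direction = "wide", sep = "_")

# No varying/v.names given: reshape() falls back to the "reshapeWide"
# attribute that the wide conversion left on toy_wide.
toy_back <- reshape(toy_wide, direction = "long")

# Assumption: we want the original row and column order back.
# The long result is stacked by index, so sort by the keys and index.
toy_back <- toy_back[order(toy_back$first_key, toy_back$second_key, toy_back$index),
                     names(toy)]
rownames(toy_back) <- NULL
```

This only works while the attribute is still attached. If toy_wide has been written to disk and read back, or rebuilt by another package, the varying, v.names, timevar and idvar arguments would need to be passed explicitly.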