MalteseUnderdog - 2 months ago
R Question

Reshape three column data frame to matrix ("long" to "wide" format)

I have a data.frame that looks like this.

x a 1
x b 2
x c 3
y a 3
y b 3
y c 2

I want this in matrix form so I can feed it to heatmap to make a plot. The result should look something like:

a b c
x 1 2 3
y 3 3 2

I have tried a function from the reshape package, and I have tried writing a manual function to do this, but I can't seem to get it right.


There are many ways to do this. This answer starts with my favorite ways, but also collects various ways from answers to similar questions scattered around this site.

tmp <- data.frame(x=gl(2,3, labels=letters[24:25]),
                  y=gl(3,1,6, labels=letters[1:3]),
                  z=c(1,2,3,3,3,2))

Using reshape2:

library(reshape2)
acast(tmp, x~y, value.var="z")

Using matrix indexing:

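with(tmp, {
  # empty matrix with one row per level of x and one column per level of y
  out <- matrix(nrow=nlevels(x), ncol=nlevels(y),
                dimnames=list(levels(x), levels(y)))
  # cbind() on factors yields their integer codes, used here as (row, col) pairs
  out[cbind(x, y)] <- z
  out
})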

Using xtabs:

xtabs(z~x+y, data=tmp)
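
Since the goal in the question is a heatmap, here is a minimal sketch of carrying the xtabs result through to a plot. It assumes the same sample data as above; the name `m` and the choice to strip the table class before plotting are illustrative, not part of the original answer. Note that xtabs sums z over each x/y combination, so repeated pairs are added together and missing pairs become 0.

```r
# Same sample data as in the answer above
tmp <- data.frame(x = gl(2, 3, labels = letters[24:25]),
                  y = gl(3, 1, 6, labels = letters[1:3]),
                  z = c(1, 2, 3, 3, 3, 2))

tab <- xtabs(z ~ x + y, data = tmp)

# Drop the "xtabs"/"table" class and the stored call to get a plain numeric matrix
m <- unclass(tab)
attr(m, "call") <- NULL

# Rowv = NA and Colv = NA keep the original row/column order (no clustering);
# scale = "none" plots the raw values instead of row-scaled ones
heatmap(m, Rowv = NA, Colv = NA, scale = "none")
```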

You can also use reshape, as suggested here: Convert table into matrix by column names, though you have to do a little manipulation afterwards to remove an extra column and get the names right (not shown).

> reshape(tmp, idvar="x", timevar="y", direction="wide")
  x z.a z.b z.c
1 x   1   2   3
4 y   3   3   2
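
One possible version of that cleanup, as a sketch rather than the only way to do it: drop the id column, move it into the row names, and strip the "z." prefix that reshape adds to the column names. The names `wide` and `m` are made up for this example.

```r
# Same sample data as in the answer above
tmp <- data.frame(x = gl(2, 3, labels = letters[24:25]),
                  y = gl(3, 1, 6, labels = letters[1:3]),
                  z = c(1, 2, 3, 3, 3, 2))

wide <- reshape(tmp, idvar = "x", timevar = "y", direction = "wide")

# Keep only the value columns and convert them to a numeric matrix
m <- as.matrix(wide[, -1])

# Use the id column as row names and remove the "z." prefix from the column names
rownames(m) <- as.character(wide$x)
colnames(m) <- sub("^z\\.", "", colnames(m))
m
```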

There's also sparseMatrix within the Matrix package, as seen here: R - convert BIG table into matrix by column names

> library(Matrix)
> with(tmp, sparseMatrix(i = as.numeric(x), j=as.numeric(y), x=z,
+                        dimnames=list(levels(x), levels(y))))
2 x 3 sparse Matrix of class "dgCMatrix"
  a b c
x 1 2 3
y 3 3 2

The daply function from the plyr package could also be used, as here:

> library(plyr)
> daply(tmp, .(x, y), function(x) x$z)
x   a b c
  x 1 2 3
  y 3 3 2

dcast from reshape2 also works, as here: Reshape data for values in one column. Note that it returns a data.frame with a column for the x value rather than a matrix.

> dcast(tmp, x~y, value.var="z")
  x a b c
1 x 1 2 3
2 y 3 3 2

Similarly, spread from "tidyr" would also work for such a transformation:

library(tidyr)
spread(tmp, y, z)
#   x a b c
# 1 x 1 2 3
# 2 y 3 3 2
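
In more recent versions of tidyr, spread is marked as superseded in favour of pivot_wider. The following sketch assumes a tidyr version that provides pivot_wider, and also shows one way to turn the resulting data frame into a matrix with row names, which is what heatmap expects. The names `wide` and `m` are illustrative only.

```r
library(tidyr)

# Same sample data as in the answer above
tmp <- data.frame(x = gl(2, 3, labels = letters[24:25]),
                  y = gl(3, 1, 6, labels = letters[1:3]),
                  z = c(1, 2, 3, 3, 3, 2))

# One row per x, one column per level of y, filled with z
wide <- pivot_wider(tmp, names_from = y, values_from = z)

# Drop the x column, convert to a numeric matrix, and use x as row names
m <- as.matrix(wide[, -1])
rownames(m) <- as.character(wide$x)
m
```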