Jean-Philippe Fontaine - 12 days ago
R Question

reshaping a large dataframe in R

I have a dataframe of 500 rows and 4004 columns that I would like to reshape to a dataframe of 500500 rows and 4 columns.
That is, from this dataframe:
V1 V2 V3 V4 ... V4001 V4002 V4003 V4004
1 2 3 4 ... 4001 4002 4003 4004
1 2 3 4 ... 4001 4002 4003 4004
1 2 3 4 ... 4001 4002 4003 4004
... ... ... ... ... ... ... ... ...
1 2 3 4 ... 4001 4002 4003 4004

I would like:

V1 V2 V3 V4
1 2 3 4
1 2 3 4
1 2 3 4
1 2 3 4
... ... ... ...
4001 4002 4003 4004
4001 4002 4003 4004
4001 4002 4003 4004
... ... ... ...
4001 4002 4003 4004

I already tried y = matrix(as.matrix(dataGaus[[1]]), 500500, 4) (where dataGaus[[1]] is my dataframe), but it doesn't give the expected result.
I also tried reshape, but I couldn't get it to reproduce the result (and I have been through a lot of posts on StackOverflow and elsewhere on the net).
In Python, we can do this with a simple command, numpy.array(dataGaus).reshape(-1,4). For various reasons I am doing my analysis in R, and I would like to know if there is a function that does the same thing as numpy's reshape(-1,4) in Python?

Thanks in advance, best

Answer

In case someone sees this post and wonders what the answer is, here is the one I got from the R mailing list (thanks to David L Carson):

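rows <- 500
cols <- 4004
# View the data as a rows x 4 x (cols/4) array: one slice per block of 4 columns
dat2 <- array(as.matrix(dataGaus[[1]]), dim = c(rows, 4, cols / 4))
# Swap dimensions 2 and 3, then flatten so the blocks are stacked into 4 columns
dat3 <- as.data.frame(matrix(aperm(dat2, c(1, 3, 2)), rows * cols / 4, 4))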

where dataGaus[[1]] is the dataframe that I read from my data using read.csv. The trick here is the use of aperm with the permutation vector c(1, 3, 2), which swaps the second and third dimensions of the array. I am still not sure exactly how it works, but for my purpose it works perfectly.
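
For anyone who, like me, wants to see what the three steps do, here is a small sketch on a toy matrix. It is not part of the mailing-list answer: the 3 x 8 matrix m, the 100 * row + column values and the name np_like are made up for illustration, so that each value shows which original row and column it came from. As far as I can tell, the aperm version stacks the blocks of 4 columns on top of each other (which is the layout I asked for), whereas numpy's reshape(-1, 4) goes through the data row by row, so the two give the same values in a different row order.

```r
# Toy stand-in for dataGaus[[1]] (hypothetical data): 3 rows, 8 columns.
# Entry [i, j] is 100 * i + j, e.g. 205 is row 2, column 5.
rows <- 3
cols <- 8
m <- outer(1:rows, 1:cols, function(i, j) 100 * i + j)

# Step 1: R stores a matrix column by column, so viewing it as a
# rows x 4 x (cols/4) array puts columns 1-4 in slice 1, columns 5-8 in slice 2.
dat2 <- array(m, dim = c(rows, 4, cols / 4))
stopifnot(identical(dat2[, , 2], m[, 5:8]))

# Step 2: swap dimensions 2 and 3, giving a rows x (cols/4) x 4 array.
dat2p <- aperm(dat2, c(1, 3, 2))

# Step 3: flatten to 4 columns. Rows 1-3 now come from columns 1-4 of m,
# rows 4-6 from columns 5-8.
dat3 <- matrix(dat2p, rows * cols / 4, 4)
stopifnot(identical(dat3, rbind(m[, 1:4], m[, 5:8])))

# For comparison, the row-by-row order of numpy's reshape(-1, 4):
# original row 1 is cut into two rows of 4, then original row 2, and so on.
np_like <- matrix(t(m), ncol = 4, byrow = TRUE)

print(dat3)
print(np_like)
```

On the real data, the same np_like line with as.matrix(dataGaus[[1]]) in place of m should give the numpy ordering, if that is what is needed instead.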