Nala - 7 days ago
R Question

Optimizing a slow double for-loop in R

I'm quite new to R and need help with a double for-loop that takes far too long to complete.

data is a data table with 659322 rows and 3 columns (ID, Game, Amount).

Each ID is repeated several times (i.e. there are several Games for each ID), but the number of rows per ID varies. We may have 2 Games for ID1 (so ID1 appears in 2 rows), 5 Games for ID2, 4 Games for ID3, etc.

I want to create a matrix datmat from data with:

- Number of rows = number of unique values of ID (nb_row = 46028)

- Number of columns = number of unique values of Game (nb_col = 30)

and fill datmat with the corresponding Amount values.
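
As a concrete illustration of the target layout, here is a toy input with made-up values (3 IDs and 3 Games; these are not rows from the real data, and the names toy and expected are only for this example) together with the matrix it should produce, written out by hand. Cells for an ID that never played a Game are assumed to stay at 0. Row names are added here only for readability.

```r
# Toy stand-in for `data`; all values are invented for illustration
toy <- data.frame(
  ID     = c("ID1", "ID1", "ID2", "ID2", "ID2", "ID3"),
  Game   = c("G1",  "G2",  "G1",  "G2",  "G3",  "G3"),
  Amount = c(10,    20,    5,     15,    25,    30),
  stringsAsFactors = FALSE
)

# Desired result: one row per unique ID, one column per unique Game,
# Amount in the matching cell and 0 where an ID has no row for that Game
expected <- matrix(
  c(10, 20,  0,
     5, 15, 25,
     0,  0, 30),
  nrow = 3, byrow = TRUE,
  dimnames = list(unique(toy$ID), unique(toy$Game))
)
print(expected)
```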

Here's what I tried:

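library(data.table)  # needed for data.table() at the end

ID <- unique(data$ID)
Game <- unique(data$Game)
nb_row <- length(ID)
nb_col <- length(Game)

# One row per ID, one column per Game, initialised to 0
datmat <- matrix(0, nb_row, nb_col, dimnames = list(NULL, Game))
for (i in seq_len(nb_row)) {
  for (j in seq_len(nb_col)) {
    # Every (ID, Game) pair rescans all rows of data, which is what makes this slow
    amount <- data$Amount[data$ID == ID[i] & data$Game == Game[j]]
    # Assign only when exactly one row matches; absent pairs keep the initial 0
    if (length(amount) == 1) datmat[i, j] <- amount
  }
}
dt <- data.table(ID, datmat)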


Any suggestions would be greatly appreciated. Thank you all!

Answer

You might want to use the reshape function:

newdata <- reshape(data, timevar = "Game", idvar = "ID", direction = "wide")
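
To see what this call returns, here is a minimal sketch on a toy table. The values and the names toy, wide and datmat2 are made up for illustration. It assumes a plain data.frame input and at most one row per (ID, Game) pair. reshape() leaves missing pairs as NA, so the sketch replaces them with 0 to match the zero-initialised matrix in the question.

```r
# Toy stand-in for `data`; all values are invented for illustration
toy <- data.frame(
  ID     = c("ID1", "ID1", "ID2", "ID2", "ID2", "ID3"),
  Game   = c("G1",  "G2",  "G1",  "G2",  "G3",  "G3"),
  Amount = c(10,    20,    5,     15,    25,    30),
  stringsAsFactors = FALSE
)

# Long to wide: one row per ID, one Amount.<Game> column per Game
wide <- reshape(toy, timevar = "Game", idvar = "ID", direction = "wide")
print(wide)

# Optional: turn the wide table into a numeric matrix like datmat
datmat2 <- as.matrix(wide[, -1])          # drop the ID column
datmat2[is.na(datmat2)] <- 0              # absent (ID, Game) pairs become 0
colnames(datmat2) <- sub("^Amount\\.", "", colnames(datmat2))
rownames(datmat2) <- wide$ID
print(datmat2)
```

The columns come out in the order in which each Game first appears in the input, so sort the rows beforehand if a particular column order matters.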