mplein - 8 months ago
R Question

applying a function to a list of data frames in R

I'd like to apply acast() (from the reshape2 package) to a list of data frames and return a new list of matrices.

library(reshape2)

list <- list(df1 = data.frame(vegspecies = c("vsp1", "vsp1", "vsp1", "vsp2", "vsp2", "vsp2"),
                              species = c("sp1", "sp1", "sp1", "sp2", "sp2", "sp2"),
                              her = c(0, 0, 1, 0, 3, 2)),
             df2 = data.frame(vegspecies = "", species = "", her = ""))

For a single data frame the function works fine.

acast(list$df1, vegspecies ~ species, fill=0)

Given that my list also contains empty data frames, I used tryCatch() to ignore the error and return NULL instead. This also seems to work fine.

tryCatch(acast(list$df2, vegspecies ~ species, fill=0), error=function(e) print(NULL))

But I am unable to apply this over the whole list of data frames. Ideally, the output should be a list of matrices (the matrices will have different sizes). I think the error lies in the way I create the empty list in the first place, but I couldn't fix it.

wideData <- list(df1 = matrix(), df2 = matrix())

for(i in 1:length(list)){wideData[i] <- tryCatch(acast(list[[i]], vegspecies ~ species, fill=0), error=function(e) print(NULL))}


We can use lapply to loop over the list of data.frames and apply acast to each element (assuming that 'df2' is a dataset similar to 'df1', with the same column names, types, etc.)

res <- lapply(lst, function(x) acast(x, vegspecies ~ species, fill=0))
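To also cope with list elements on which acast fails, the tryCatch from the question can be moved inside the function passed to lapply. The following is a minimal sketch rather than a definitive implementation. The sample data are hypothetical: 'df2' is a data frame with no columns at all, so the cast cannot succeed on it, and value.var = "her" is an assumption about which column holds the values.

```r
library(reshape2)

# Hypothetical sample data: df2 has no columns, so acast() cannot succeed on it
lst <- list(
  df1 = data.frame(vegspecies = c("vsp1", "vsp1", "vsp1", "vsp2", "vsp2", "vsp2"),
                   species    = c("sp1", "sp1", "sp1", "sp2", "sp2", "sp2"),
                   her        = c(0, 0, 1, 0, 3, 2)),
  df2 = data.frame()
)

# Cast each element; any error is caught and replaced by NULL
res <- lapply(lst, function(x) {
  tryCatch(
    acast(x, vegspecies ~ species, value.var = "her", fill = 0),
    error = function(e) NULL
  )
})
```

lapply keeps NULL results, so 'res' has the same length and names as 'lst'. Because several rows of 'df1' fall into the same cell, acast falls back to counting rows (length) and prints a message saying so. Passing fun.aggregate = sum would give sums instead, if that is what is wanted.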

NOTE: It is better not to name a list object "list" (similarly a vector as "vector" or data.frame as "data.frame"). It can be any other name.
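If an explicit for loop is preferred, the loop from the question can probably be repaired along the following lines. With single brackets, wideData[i] <- m tries to squeeze a whole matrix into one list slot (only the first value is kept, with a warning), and assigning NULL to a list position deletes that position. The sketch below wraps each result in list() to avoid both problems. The sample data are hypothetical ('df2' has no columns, so the cast fails on it), and value.var = "her" is an assumption.

```r
library(reshape2)

# Hypothetical sample data: df2 has no columns, so acast() cannot succeed on it
lst <- list(
  df1 = data.frame(vegspecies = c("vsp1", "vsp1", "vsp1", "vsp2", "vsp2", "vsp2"),
                   species    = c("sp1", "sp1", "sp1", "sp2", "sp2", "sp2"),
                   her        = c(0, 0, 1, 0, 3, 2)),
  df2 = data.frame()
)

# Preallocate a list of NULLs carrying the same names as the input
wideData <- setNames(vector("list", length(lst)), names(lst))

for (i in seq_along(lst)) {
  m <- tryCatch(
    acast(lst[[i]], vegspecies ~ species, value.var = "her", fill = 0),
    error = function(e) NULL
  )
  # Wrapping in list() stores the whole matrix in slot i and, when m is NULL,
  # keeps a NULL placeholder instead of deleting the slot
  wideData[i] <- list(m)
}
```

seq_along(lst) is used instead of 1:length(lst) so that the loop body is skipped entirely when the list is empty.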

Alternatively, we can use a single dcast on a single data.table by combining the elements of 'lst' with rbindlist (from the data.table package), adding an id column ('grp') that identifies the list element

library(data.table)

# stack all list elements; 'grp' records which element each row came from
dt <- rbindlist(lst, idcol="grp")
dcast(dt, grp + vegspecies ~ species, fill = 0)
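To see what the combined approach produces, here is a small sketch on made-up data. The two data frames are hypothetical, and value.var = "her" and fun.aggregate = sum are assumptions added to make the aggregation explicit. Without them, dcast guesses the value column and counts rows when a cell has several values.

```r
library(data.table)

# Hypothetical sample data: two data frames with the same columns
lst <- list(
  df1 = data.frame(vegspecies = c("vsp1", "vsp1", "vsp2"),
                   species    = c("sp1", "sp1", "sp2"),
                   her        = c(1, 2, 3)),
  df2 = data.frame(vegspecies = c("vsp1", "vsp3"),
                   species    = c("sp1", "sp3"),
                   her        = c(4, 5))
)

# Stack the data frames; 'grp' holds the name of the originating list element
dt <- rbindlist(lst, idcol = "grp")

# One wide table for all groups, summing 'her' within each cell
wide <- dcast(dt, grp + vegspecies ~ species,
              value.var = "her", fun.aggregate = sum, fill = 0)
```

Unlike the lapply approach, every group here shares the same set of species columns, so a species that does not occur in a group shows up as a column of zeros for that group rather than being absent.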