Susu Susu - 3 months ago 15
R Question

Adding a new column to each data frame in a list, using one element of a vector per data frame

I have a list of data frames, and for each data frame I'm trying to add a new variable whose value, constant across all observations, is the corresponding element of a vector.

My question would be: Is there some kind of "index" that keeps track of which step the lapply function is currently at? I wasn't able to find anything on this, but I believe it would solve my problem, because I wouldn't need a loop and could just use

timevar <- time[magic.index]
I think the example makes it a lot clearer what I mean.

#list of data frames
df1 <- data.frame("Var1" = c(1:10))
df2 <- data.frame("Var1" = c(1:10),"Var2" = c(1:10))
df3 <- data.frame("Var1" = c(1:10),"Var2" = c(1:10),"Var3" = c(1:10))
dfs <- list(df1,df3,df2)

time <- c(1,2,1)

#this is what I want to do with lapply
lapply(dfs, function(x) within(x, timevar <- 1))

#my attempt with a loop, which does not give the result I want
dfs2 <- for (i in seq_along(dfs)){
  lapply(dfs, function(x) within(x, timevar <- time))
}

#this is what the result should look like
dfs[[1]] <- within(dfs[[1]], timevar <- 1)
dfs[[2]] <- within(dfs[[2]], timevar <- 2)
dfs[[3]] <- within(dfs[[3]], timevar <- 1)
dfs

Answer

We can use Map to create a 'timevar' column by cbinding each data frame in 'dfs' with the corresponding element of the 'time' vector.

Map(cbind, dfs, timevar = time)
#[[1]]
#   Var1 timevar
#1     1       1
#2     2       1
#3     3       1
#4     4       1
#5     5       1
#6     6       1
#7     7       1
#8     8       1
#9     9       1
#10   10       1

#[[2]]
#   Var1 Var2 Var3 timevar
#1     1    1    1       2
#2     2    2    2       2
#3     3    3    3       2
#4     4    4    4       2
#5     5    5    5       2
#6     6    6    6       2
#7     7    7    7       2
#8     8    8    8       2
#9     9    9    9       2
#10   10   10   10       2

#[[3]]
#   Var1 Var2 timevar
#1     1    1       1
#2     2    2       1
#3     3    3       1
#4     4    4       1
#5     5    5       1
#6     6    6       1
#7     7    7       1
#8     8    8       1
#9     9    9       1
#10   10   10       1
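
On the "index" part of the question: as far as I know, lapply does not expose a documented counter for the current step, but iterating over the positions with seq_along gives the same effect. A minimal sketch using the objects from the question (the result name dfs_indexed is only illustrative):

```r
# Objects from the question
df1 <- data.frame(Var1 = 1:10)
df2 <- data.frame(Var1 = 1:10, Var2 = 1:10)
df3 <- data.frame(Var1 = 1:10, Var2 = 1:10, Var3 = 1:10)
dfs <- list(df1, df3, df2)
time <- c(1, 2, 1)

# Loop over positions instead of elements, so that i can index
# both the list of data frames and the 'time' vector
dfs_indexed <- lapply(seq_along(dfs), function(i) {
  within(dfs[[i]], timevar <- time[i])
})
```

Like Map, this returns a new list and leaves 'dfs' untouched, so assign the result back to 'dfs' if the original list should be updated.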

If we are using the hadleyverse (now called the tidyverse), map2 from purrr can be useful as well:

library(purrr)
dfs %>% 
    map2(time, ~cbind(.x, timevar=.y))
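
purrr also has imap, which comes closest to the "magic index" asked about: for an unnamed list it passes the position of each element as the second argument. A sketch, assuming purrr is installed and 'dfs' has no names (with a named list, .y would be the name rather than the position); the result name dfs_imap is only illustrative:

```r
library(purrr)

# Objects from the question
df1 <- data.frame(Var1 = 1:10)
df2 <- data.frame(Var1 = 1:10, Var2 = 1:10)
df3 <- data.frame(Var1 = 1:10, Var2 = 1:10, Var3 = 1:10)
dfs <- list(df1, df3, df2)
time <- c(1, 2, 1)

# .x is the data frame, .y is its position in the (unnamed) list
dfs_imap <- imap(dfs, ~ cbind(.x, timevar = time[.y]))
```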