MPhD - 2 months ago
R Question

How to merge subsets of data frames from a list (i.e., merge all of the first dfs from each list component)

I have seen a number of answers on how to merge data frames from a list when each list element is a single data frame. However, in my case, each list element contains two data frames. I want to merge all of the first data frames together and all of the second data frames together. As a dummy example:

lst<-list()
lst[[1]]<-list(data.frame(cat=c(1:5), type=c(11:15)), data.frame(group=c("A","B","C"), num=c(1:3)))
lst[[2]]<-list(data.frame(cat=c(22:26), type=c(50:54)), data.frame(group=c("H","I","J"), num=c(7:9)))


I want to merge the first elements together and the second elements together, to yield two data frames:

df1:
   cat type
1    1   11
2    2   12
3    3   13
4    4   14
5    5   15
6   22   50
7   23   51
8   24   52
9   25   53
10  26   54

df2:
  group num
1     A   1
2     B   2
3     C   3
4     H   7
5     I   8
6     J   9


I am sure there is some straightforward way to do this (somehow with do.call and rbind??) but I cannot figure out how to reference the various elements within each list properly.

Clearly with this small example I could just do it manually by:

df1<-rbind(lst[[1]][[1]], lst[[2]][[1]])


However, my actual list includes hundreds of data frames. I can do it with a loop, rbinding them in one at a time, but I'm sure there is a more efficient way... Thanks for any help!

Answer

You can use the Reduce function, which lets you supply your own combining function, to rbind the data frames. Reduce takes two elements of the list at a time and collapses them into one using that function. Because the two data frames in each element need to be bound separately, use Map(rbind, x, y) as the combining function, so that the first data frames are bound together and the second data frames are bound together:

Reduce(function(x, y) Map(rbind, x, y), lst)

# [[1]]
#    cat type
# 1    1   11
# 2    2   12
# 3    3   13
# 4    4   14
# 5    5   15
# 6   22   50
# 7   23   51
# 8   24   52
# 9   25   53
# 10  26   54

# [[2]]
#   group num
# 1     A   1
# 2     B   2
# 3     C   3
# 4     H   7
# 5     I   8
# 6     J   9
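
To make it clearer what the Reduce/Map combination does, here is a small sketch that writes out the single pairwise step by hand and compares it with the Reduce call. It rebuilds the example list from the question; the names `step` and `res` are just illustrative.

```r
# Rebuild the example list from the question
lst <- list(
  list(data.frame(cat = 1:5, type = 11:15),
       data.frame(group = c("A", "B", "C"), num = 1:3)),
  list(data.frame(cat = 22:26, type = 50:54),
       data.frame(group = c("H", "I", "J"), num = 7:9))
)

# Map pairs up the i-th data frames of two list components
# and calls rbind on each pair, returning a list of results
step <- Map(rbind, lst[[1]], lst[[2]])

# With only two components, Reduce performs exactly this one step;
# with more components it repeats it, carrying the result forward
res <- Reduce(function(x, y) Map(rbind, x, y), lst)

# Check that the hand-written step and the Reduce call agree
identical(step, res)
```

With a longer list, Reduce applies the same step again to the running result and the next component, so the intermediate data frames grow one component at a time.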

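Or, possibly faster, pull the i-th data frame out of every list element and bind them all in a single rbind call with do.call, instead of binding pairwise as Reduce does: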

lapply(1:2, function(i) do.call(rbind, lapply(lst, `[[`, i)))
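
The `1:2` above is hard-coded for two data frames per list element. A slightly more general sketch, assuming every element holds the same number of data frames in the same order with matching columns, could take that count from the first element. The names `n_pos` and `combined` are hypothetical, chosen only for this example.

```r
# Rebuild the example list from the question
lst <- list(
  list(data.frame(cat = 1:5, type = 11:15),
       data.frame(group = c("A", "B", "C"), num = 1:3)),
  list(data.frame(cat = 22:26, type = 50:54),
       data.frame(group = c("H", "I", "J"), num = 7:9))
)

# Assumption: every component has as many data frames as the first one
n_pos <- length(lst[[1]])

# For each position i, extract the i-th data frame from every component
# and bind them all with one rbind call
combined <- lapply(seq_len(n_pos), function(i) {
  do.call(rbind, lapply(lst, `[[`, i))
})

# Pull the results out as the two data frames the question asks for
df1 <- combined[[1]]
df2 <- combined[[2]]
```

If the components could differ in length or column layout, it would be worth checking that before binding, since rbind stops with an error when column names do not match.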