alaj alaj - 3 months ago 35
R Question

Intersecting a list of data frames in R

Is there a function in R that intersects a list of data frames with different numbers of columns and returns a list of data frames containing only the columns they all have in common?

As an example I have the following list:

ll <- list(
  structure(list(V1 = c(8L, 2L, 7L), V2 = c(1L, 9L, 3L), V3 = 4:6),
            .Names = c("V1", "V2", "V3"),
            row.names = c(NA, -3L), class = "data.frame"),
  structure(list(V1 = c(1L, 3L, 2L), V2 = c(5L, 4L, 6L)),
            .Names = c("V1", "V2"),
            row.names = c(NA, -3L), class = "data.frame")
)

> ll
[[1]]
  V1 V2 V3
1  8  1  4
2  2  9  5
3  7  3  6

[[2]]
  V1 V2
1  1  5
2  3  4
3  2  6


The resulting list should be:

> new.ll
[[1]]
  V1 V2
1  8  1
2  2  9
3  7  3

[[2]]
  V1 V2
1  1  5
2  3  4
3  2  6


Thanks.

Answer

There may be a better alternative, but this is the only approach I can think of right now.

# Column names shared by every data frame in the list
mincol <- Reduce(intersect, lapply(ll, colnames))
# Keep only those columns in each data frame
lapply(ll, function(x) x[mincol])

#[[1]]
#  V1 V2
#1  8  1
#2  2  9
#3  7  3

#[[2]]
#  V1 V2
#1  1  5
#2  3  4
#3  2  6

This finds the column names common to all the data frames by folding intersect over their column names with Reduce, and then selects only those columns from each data frame in the list.
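If this is needed in more than one place, the same idea can be wrapped in a small function. The sketch below is only one way to do it: the name keep_common_columns is made up for illustration, and the warning for the no-shared-columns case is my own assumption about what might be useful, not part of the approach above.

```r
# Hypothetical helper (the name is an assumption, not a base R function):
# keep only the columns shared by every data frame in a list.
keep_common_columns <- function(dfs) {
  stopifnot(is.list(dfs), length(dfs) > 0)
  # Fold intersect() over the per-data-frame column names
  common <- Reduce(intersect, lapply(dfs, colnames))
  if (length(common) == 0) {
    warning("the data frames share no column names")
  }
  # Select by name, so every element ends up with the same column order
  lapply(dfs, function(df) df[, common, drop = FALSE])
}

# Same data as in the question
ll <- list(
  data.frame(V1 = c(8L, 2L, 7L), V2 = c(1L, 9L, 3L), V3 = 4:6),
  data.frame(V1 = c(1L, 3L, 2L), V2 = c(5L, 4L, 6L))
)
new.ll <- keep_common_columns(ll)
```

Because the columns are selected by name, the result follows the column order of the first data frame, even if later data frames store the same columns in a different order.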