David Z David Z - 2 months ago 13
R Question

unlist and merge selected columns to data frame in R

Suppose I have a list such like:

df1<-data.frame(n=letters[1:4], x=1:4, y=2:5, z=3:6)
df2<-data.frame(n=letters[2:5], x=2:5, y=3:6, z=4:7)
df3<-data.frame(n=letters[3:7], x=2:6, y=3:7, z=4:8)
ls<-list(df1, df2, df3)
ls
[[1]]
n x y z
1 a 1 2 3
2 b 2 3 4
3 c 3 4 5
4 d 4 5 6

[[2]]
n x y z
1 b 2 3 4
2 c 3 4 5
3 d 4 5 6
4 e 5 6 7

[[3]]
n x y z
1 c 2 3 4
2 d 3 4 5
3 e 4 5 6
4 f 5 6 7
5 g 6 7 8


what I wanted is to merger the first two columns of each data frame in the list by column
n
and a desired output would be:

n x1 x2 x3
1 a 1 NA NA
2 b 2 2 NA
3 c 3 3 2
4 d 4 4 3
5 e NA 5 4
6 f NA NA 5
7 g NA NA 6


And same thing for y and z:

n y1 y2 y3
1 a 2 NA NA
2 b 3 3 NA
3 c 4 4 3
4 d 5 5 4
5 e NA 6 5
6 f NA NA 6
7 g NA NA 7

n z1 z2 z3
1 a 3 NA NA
2 b 4 4 NA
3 c 5 5 4
4 d 6 6 5
5 e NA 7 6
6 f NA NA 7
7 g NA NA 8

Answer

We get the unique column names from the list of data.frames except the 'n' ('nm1'), loop through those (lapply(nm1,...), subset the columns of each of the 'data.frame' in 'ls' (lapply(ls, function(x) ...), and use Reduce, with merge to merge the datasets in the list.

nm1 <- setdiff(unlist(lapply(ls, names)), "n")
lapply(nm1, function(nm)  setNames(Reduce(function(...)
    merge(..., all=TRUE, by = "n"), lapply(ls, 
                 function(x) x[c("n", nm)])), make.unique(c("n", rep(nm, length(nm1))))))
#[[1]]
#  n  x x.1 x.2
#1 a  1  NA  NA
#2 b  2   2  NA
#3 c  3   3   2
#4 d  4   4   3
#5 e NA   5   4
#6 f NA  NA   5
#7 g NA  NA   6

#[[2]]
#  n  y y.1 y.2
#1 a  2  NA  NA
#2 b  3   3  NA
#3 c  4   4   3
#4 d  5   5   4
#5 e NA   6   5
#6 f NA  NA   6
#7 g NA  NA   7

#[[3]]
#  n  z z.1 z.2
#1 a  3  NA  NA
#2 b  4   4  NA
#3 c  5   5   4
#4 d  6   6   5
#5 e NA   7   6
#6 f NA  NA   7
#7 g NA  NA   8

NOTE: ls is a function name that lists the objects. It is better to avoid naming objects with known R functions.