Lisa Ann - 3 months ago
R Question

Merging list members with differing numbers of rows

Here is my list, which you can run in your console (please tell me if it's too long for an example and I will shorten it):

my_list = list(structure(list(PX_LAST = c(0.398, 0.457, 0.4, 0.159, 0.126,
0.108, 0.26, 0.239, 0.222, 0.191, 0.184)), .Names = "PX_LAST", row.names = c("2014-04-28 00:00:00",
"2014-04-29 00:00:00", "2014-04-30 00:00:00", "2014-05-02 00:00:00",
"2014-05-05 00:00:00", "2014-05-06 00:00:00", "2014-05-07 00:00:00",
"2014-05-08 00:00:00", "2014-05-09 00:00:00", "2014-05-12 00:00:00",
"2014-05-13 00:00:00"), class = "data.frame"), structure(list(
PX_LAST = c(1.731, 1.706, 1.7095, 1.69, 1.713, 1.711, 1.724,
1.699, 1.702, 1.705, 1.649, 1.611)), .Names = "PX_LAST", row.names = c("2014-04-29 00:00:00",
"2014-04-30 00:00:00", "2014-05-01 00:00:00", "2014-05-02 00:00:00",
"2014-05-05 00:00:00", "2014-05-06 00:00:00", "2014-05-07 00:00:00",
"2014-05-08 00:00:00", "2014-05-09 00:00:00", "2014-05-12 00:00:00",
"2014-05-13 00:00:00", "2014-05-14 00:00:00"), class = "data.frame"),
structure(list(PX_LAST = c(0.481, 0.456, 0.448, 0.439, 0.436,
0.448, 0.458, 0.466, 0.432, 0.437, 0.441, 0.417, 0.4035)), .Names = "PX_LAST", row.names = c("2014-04-28 00:00:00",
"2014-04-29 00:00:00", "2014-04-30 00:00:00", "2014-05-01 00:00:00",
"2014-05-02 00:00:00", "2014-05-05 00:00:00", "2014-05-06 00:00:00",
"2014-05-07 00:00:00", "2014-05-08 00:00:00", "2014-05-09 00:00:00",
"2014-05-12 00:00:00", "2014-05-13 00:00:00", "2014-05-14 00:00:00"
), class = "data.frame"), structure(list(PX_LAST = c(1.65,
1.65, 1.64, 1.65, 1.662, 1.6595, 1.665, 1.6595, 1.6625, 1.652,
1.645, 1.6245, 1.627, 1.633)), .Names = "PX_LAST", row.names = c("2014-04-25 00:00:00",
"2014-04-28 00:00:00", "2014-04-29 00:00:00", "2014-04-30 00:00:00",
"2014-05-01 00:00:00", "2014-05-02 00:00:00", "2014-05-05 00:00:00",
"2014-05-06 00:00:00", "2014-05-07 00:00:00", "2014-05-08 00:00:00",
"2014-05-09 00:00:00", "2014-05-12 00:00:00", "2014-05-13 00:00:00",
"2014-05-14 00:00:00"), class = "data.frame"))


My question is: how can I use do.call() on that list to merge all the data according to their date?

Note that both merge and cbind return errors that I have not been able to resolve:

> do.call(what = merge, args = my_list)
Error in fix.by(by.x, x) :
'by' must specify column(s) as numbers, names or logical

> do.call(what = cbind, args = my_list)
Error in data.frame(..., check.names = FALSE) :
arguments imply differing number of rows: 11, 12, 13, 14


I would like to get a single data matrix (with any missing or non-matching data replaced by NAs) equal to the one I would get by using merge() on the elements of my_list.

Answer

This would be a bit easier if you were not merging by row names, but you can do it with the Reduce function, which sequentially applies a function along a list of values (in this case, data.frames). Try:

Reduce(function(x,y) {
    dd<-merge(x,y,by=0); rownames(dd)<-dd$Row.names; dd[-1]
}, my_list)

This keeps only the rows whose names match in every data frame. To keep all rows and fill the gaps with NA, add all=TRUE to the merge() call; you can also customize it in any other way you would a regular merge().
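To make the difference between the two variants concrete, here is a minimal sketch on made-up data. The names toy_list and merge_by_rownames are hypothetical and used only for illustration; the values are not the asker's.

```r
# Hypothetical toy data (not the asker's): three data frames keyed by row names.
toy_list <- list(
  data.frame(PX_LAST = c(1, 2),
             row.names = c("2014-04-28", "2014-04-29")),
  data.frame(PX_LAST = c(10, 20),
             row.names = c("2014-04-29", "2014-04-30")),
  data.frame(PX_LAST = c(100, 200, 300),
             row.names = c("2014-04-29", "2014-04-30", "2014-05-01"))
)

# Merge two data frames on their row names, then restore the row names.
merge_by_rownames <- function(x, y, all = FALSE) {
  dd <- merge(x, y, by = 0, all = all)
  rownames(dd) <- dd$Row.names
  dd[-1]  # drop the helper "Row.names" column that merge() adds
}

# Inner join: keeps only the dates present in every data frame.
inner_res <- Reduce(merge_by_rownames, toy_list)

# Outer join: keeps every date and fills the gaps with NA.
outer_res <- Reduce(function(x, y) merge_by_rownames(x, y, all = TRUE), toy_list)

print(inner_res)
print(outer_res)
```

With this toy data the inner join should leave only the single date shared by all three data frames, while the outer join keeps all four dates.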

You will get a warning about column names because each of your columns has an identical name, so when you merge them into multiple columns, merge doesn't know what to name them. You could rename them with something like:

my_new_list <- Map(
    function(x,n) {
        names(x)<-n; x
    }, 
    my_list, 
    paste("PX_LAST",1:length(my_list), sep="_")
)

then

Reduce(function(x,y) {
    dd<-merge(x,y,by=0); rownames(dd)<-dd$Row.names; dd[-1]
}, my_new_list)

won't complain.
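
Putting the pieces together, the following is a rough end-to-end sketch on made-up data, assuming you want the NA-filled matrix described in the question. The names toy_list, new_names, renamed, merged and result are hypothetical, and the final as.matrix() step is an addition of this sketch rather than part of the answer above.

```r
# Hypothetical toy data: four data frames that all share the column name PX_LAST.
toy_list <- list(
  data.frame(PX_LAST = c(1, 2),       row.names = c("2014-04-28", "2014-04-29")),
  data.frame(PX_LAST = c(10, 20),     row.names = c("2014-04-29", "2014-04-30")),
  data.frame(PX_LAST = c(100, 200),   row.names = c("2014-04-28", "2014-04-30")),
  data.frame(PX_LAST = c(1000, 2000), row.names = c("2014-04-29", "2014-05-01"))
)

# Give each data frame a unique column name: PX_LAST_1, PX_LAST_2, ...
new_names <- paste("PX_LAST", seq_along(toy_list), sep = "_")
renamed <- Map(function(x, n) { names(x) <- n; x }, toy_list, new_names)

# Outer-merge on row names so that unmatched dates become NA.
merged <- Reduce(function(x, y) {
  dd <- merge(x, y, by = 0, all = TRUE)
  rownames(dd) <- dd$Row.names
  dd[-1]  # drop the helper "Row.names" column
}, renamed)

# Convert the all-numeric data frame to a matrix (assumed to be the desired output).
result <- as.matrix(merged)
print(result)
```

Because every column already has a unique name before merging, merge() has no clashing names to disambiguate.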