Android Beginner Android Beginner - 23 days ago 6
R Question

Sum the columns in a list of dataframes

I want to generate a single dataframe from the list of dataframes in which all columns are same except one column 'Income' which should be sum of the all the 'Income' in the list.

Here is my list of dataframe

mylist= structure(list(`1` = structure(list(ID = c(36L, 37L, 38L, 39L), Income = c(0, 0, 0, 9100)), .Names = c("ID", "Income"), row.names = c(1L, 2L, 3L, 4L), class = "data.frame"), `2` = structure(list(ID = c(36L, 37L, 38L, 39L), Income = c(0, 0, 0, 0)), .Names = c("ID", "Income"), row.names = c(1L, 2L, 3L, 4L), class = "data.frame"), `3` = structure(list(ID = c(36L, 37L, 38L, 39L), Income = c(7360, 0, 0, 0)), .Names = c("ID", "Income"), row.names = c(1L, 2L, 3L, 4L), class = "data.frame")))

> mylist
$`1`
ID Income
1 36 0
2 37 0
3 38 0
4 39 9100

$`2`
ID Income
1 36 0
2 37 0
3 38 0
4 39 0

$`3`
ID Income
1 36 7360
2 37 0
3 38 0
4 39 0


This is what I want to do :

ID Income
34 36 7360
26 37 0
23 38 0
15 39 9100


I have tried to use reduce() to do the sum but it creates a separate column which I want to avoid :

Reduce(function(df1, df2) data.frame(df1[,], res=df1["Income"] + df2["Income"]),mylist)

Answer

If the 'ID's in different list can be different, we merge the datasets using Reduce, and then do the rowSums of the output except the first one to create the 'Income' column.

r1 <- Reduce(function(...) merge(..., by = "ID"), mylist) 
data.frame(r1[1], Income = rowSums(r1[-1]))
#   ID Income
#1 36   7360
#2 37      0
#3 38      0
#4 39   9100

If the 'IDs' are the same and in the same order for all the datasets in the list, we create the data.frame by extracting the 'ID' from the first element of 'mylist' and get the sum of the 'Income' using Reduce with +.

data.frame(mylist[[1]][1], Reduce(`+`, lapply(mylist, `[`, 'Income')))