min min - 1 month ago 7
R Question

Got an error message when using summarise inside for loop

I have a list of 244 data frames called d, and d[[1]] looks like this:

d[[1]]

  year pos days  sal
1 2009   A   31 2000
2 2009   B   60 4000
3 2009   C   10  600
4 2010   B   10 1000
5 2010   D   90 7000


I would like to group the data by year, sum days and sal within each group, and keep the pos whose days value is the largest in that group.

The expected result is:

  year pos days  sal
1 2009   B  101 6600
2 2010   D  100 8000


I know how to do this for a single data frame.
I did it this way:

library(dplyr)
ygroup <- group_by(d[[1]], year)
summarise(ygroup, pos = pos[which.max(days)], days = sum(days), sal = sum(sal))


But I want to apply the same operation to all 244 data frames in the list d.
I tried this:

e <- list()
ygroup <- list()
for (i in 1:244) {
  ygroup[[i]] <- group_by(d[[i]], year)
  e[[i]] <- summarise(ygroup[[i]], pos = pos[which.max(days)], days = sum(days), sal = sum(sal))
}


But this doesn't work; it fails with the following error:

Error: expecting a single value


(I think this part,
pos = pos[which.max(days)]
is causing the problem, but I'm not sure.)
How can I solve this?

Any comments will be greatly appreciated! :)

Answer

We can use lapply with an anonymous function to loop over the list of data frames ('d') and apply the same group_by/summarise pipeline to each element:

lapply(d, function(x) x %>% 
                       group_by(year) %>% 
                       summarise(pos = pos[which.max(days)], 
                                 days = sum(days), sal = sum(sal)))
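
As a self-contained illustration of the lapply approach, here is a minimal sketch on a hypothetical two-element list standing in for the 244 data frames. The post does not say what triggers "expecting a single value"; one possible cause (an assumption, not confirmed by the question) is a group where days is entirely NA, so that which.max() returns integer(0) and pos[...] has length zero. The sample data, the helper name summarise_year, and the trailing [1] guard are all illustrative additions, not part of the original answer.

```r
library(dplyr)

# Hypothetical stand-in for the list 'd' from the question.
d <- list(
  data.frame(year = c(2009, 2009, 2009, 2010, 2010),
             pos  = c("A", "B", "C", "B", "D"),
             days = c(31, 60, 10, 10, 90),
             sal  = c(2000, 4000, 600, 1000, 7000)),
  data.frame(year = c(2011, 2011),
             pos  = c("A", "B"),
             days = c(NA, NA),   # all-NA group: which.max() returns integer(0)
             sal  = c(500, 700))
)

# Hypothetical helper: the same pipeline as above, with a guard on pos.
summarise_year <- function(x) {
  x %>%
    group_by(year) %>%
    summarise(pos  = pos[which.max(days)][1],  # [1] turns an empty result into NA
              days = sum(days),
              sal  = sum(sal))
}

e <- lapply(d, summarise_year)
e[[1]]
```

For the first data frame this should reproduce the expected result from the question (B with 101 days and 6600 sal for 2009, D with 100 and 8000 for 2010). If the guard is not wanted, dropping the [1] gives back the pipeline from the answer unchanged.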