R Question

Error when using summarise inside a for loop

I have a list of 244 data frames called d. Each data frame in the list looks like this:


  year pos days  sal
1 2009   A   31 2000
2 2009   B   60 4000
3 2009   C   10  600
4 2010   B   10 1000
5 2010   D   90 7000

I would like to group the data by year, sum days and sal within each group, and select the pos that has the largest days in the group.

The result should look like this:

  year pos days  sal
1 2009   B  101 6600
2 2010   D  100 8000

I know how to do this for a single data frame:

summarise(ygroup, pos = pos[which.max(days)], days = sum(days), sal = sum(sal))

But I want to apply the same operation to all 244 data frames in the list d.
I tried this:

for (i in 1:244) {
  e[[i]] <- summarise(ygroup[[i]], pos = pos[which.max(days)], days = sum(days), sal = sum(sal))
}

But this doesn't work; it fails with the following error:

Error: expecting a single value

(I think this part,
pos = pos[which.max(days)]
is causing the problem, but I'm not sure.)
How can I solve this?

Any comments will be greatly appreciated! :)

Answer

We can use lapply with an anonymous function to loop over the list of data frames (d), grouping each one by year before summarising:

library(dplyr)  # provides %>%, group_by and summarise

lapply(d, function(x) x %>%
                       group_by(year) %>%
                       summarise(pos = pos[which.max(days)],
                                 days = sum(days), sal = sum(sal)))
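
To check the approach on something small, here is a minimal sketch using the sample data from the question. The two-element list and the helper name summarise_year are assumptions made for illustration; the real d holds 244 data frames.

```r
library(dplyr)

# Sample data frame copied from the question.
df <- data.frame(
  year = c(2009, 2009, 2009, 2010, 2010),
  pos  = c("A", "B", "C", "B", "D"),
  days = c(31, 60, 10, 10, 90),
  sal  = c(2000, 4000, 600, 1000, 7000),
  stringsAsFactors = FALSE
)

# Hypothetical stand-in for the real list of 244 data frames.
d <- list(df, df)

# Illustrative helper: group by year, then summarise.
# pos is computed first, while days still refers to the original column;
# once days = sum(days) has run, days is a single value per group.
summarise_year <- function(x) {
  x %>%
    group_by(year) %>%
    summarise(pos  = pos[which.max(days)],
              days = sum(days),
              sal  = sum(sal))
}

e <- lapply(d, summarise_year)
print(e[[1]])
```

For the sample data, each element of e has pos B with days 101 and sal 6600 for 2009, and pos D with days 100 and sal 8000 for 2010, matching the desired result in the question. If a group has only NA values in days, which.max(days) returns integer(0), so pos[which.max(days)] is not a single value; that is one possible cause of the error, though the question does not confirm it.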
