Tsuyoshi Endo - 1 year ago
R Question

Efficient way to implement sumif and countif in R

When I implement countif and sumif in R,
I always use sapply() and table() like this:

symbol = letters[sample(1:3, 5, replace=TRUE)]
df=data.frame(a=symbol,
b=seq_len(length(symbol)))


#sumif
summary=data.frame(key=unique(df$a))
summary$sum=sapply(
seq_len(nrow(summary)),
function(i) with(df, sum(df$b[a==summary$key[i]]))
)

#countif
countif = data.frame(
key=names(table(df$a)),
count=as.vector(table(df$a))
)

summary = merge(
summary,
countif,
c("key")
)


Is there a more efficient method?

Answer Source

We can use data.table for efficiency. Convert the 'data.frame' to a 'data.table' by reference with setDT(df), then group by 'a' and compute the sum of 'b' and the number of rows in each group (.N).

library(data.table)
setDT(df)[, .(sum = sum(b), count = .N), .(key = a)]
#    key sum count
#1:   c   1     1
#2:   a   6     2
#3:   b   8     2

Another option is dplyr, where n() gives the number of rows in each group (.N is specific to data.table):

library(dplyr)
df %>%
   group_by(key = a) %>%
   summarise(sum = sum(b), count = n())
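
For comparison, the same summary can be sketched in base R with no extra packages. This is not part of the original answer. The data is fixed by hand instead of sampled (an assumption, chosen to be consistent with the data.table output shown earlier) so the result is reproducible, and summary_df is a name introduced only for illustration.

```r
# Fixed example data (hand-picked values, not the random sample from the question)
df <- data.frame(a = c("c", "a", "b", "a", "b"),
                 b = 1:5)

# sumif: sum of b within each distinct value of a
sums <- tapply(df$b, df$a, sum)

# countif: number of rows for each distinct value of a
counts <- table(df$a)

# Combine both results; counts is indexed by name so the rows line up with sums
summary_df <- data.frame(key   = names(sums),
                         sum   = as.vector(sums),
                         count = as.vector(counts[names(sums)]))
print(summary_df)
```

tapply() and table() each make a single grouped pass over the column, so the per-key sapply() loop and the merge() step from the question are not needed. The rows come back sorted by key, not in order of first appearance.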