Tsuyoshi Endo - 3 months ago 56
R Question

# Efficient way to implement sumif and countif in R

When I implement countif and sumif in R,
I always use `sapply` and `table`, like this:

``````
# Example data: five rows with a random key in a and the values 1..5 in b
symbol = letters[sample(1:3, 5, replace=TRUE)]
df = data.frame(a = symbol,
                b = seq_along(symbol))

# sumif: sum of b for each unique key
summary = data.frame(key = unique(df$a))
summary$sum = sapply(
  seq_len(nrow(summary)),
  function(i) sum(df$b[df$a == summary$key[i]])
)

# countif: number of rows for each key
countif = data.frame(
  key = names(table(df$a)),
  count = as.vector(table(df$a))
)

# combine both results on the key column
summary = merge(
  summary,
  countif,
  by = "key"
)
``````

Is there a more efficient way to do this?

We can use `data.table` for efficiency. Convert the data.frame to a data.table by reference with `setDT(df)`; then, grouped by `a`, compute the `sum` of `b` and the number of rows in each group (`.N`).

``````
library(data.table)
setDT(df)[, .(sum = sum(b), count = .N), .(key = a)]
#    key sum count
#1:   c   1     1
#2:   a   6     2
#3:   b   8     2
``````

Another option is `dplyr`:

``````
library(dplyr)
df %>%
  group_by(key = a) %>%
  # n() counts the rows in each group (.N only exists in data.table)
  summarise(sum = sum(b), count = n())
``````
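
If you would rather not load a package, the same grouped sum and count can be sketched in base R with `tapply`. This is only a minimal illustration: the fixed `df` below is made-up data chosen to be reproducible, and `summary_df` is a name picked for this example. It will likely be slower than `data.table` on large inputs.

``````r
# Fixed example data (illustrative values, not the random sample from the question)
df <- data.frame(a = c("c", "a", "b", "a", "b"), b = 1:5)

# tapply() applies a function to b within each group of a
sums   <- tapply(df$b, df$a, sum)     # sumif per key
counts <- tapply(df$b, df$a, length)  # countif per key

# Assemble one row per key; keys come back in sorted order
summary_df <- data.frame(
  key   = names(sums),
  sum   = as.vector(sums),
  count = as.vector(counts)
)
print(summary_df)

# Single-criterion equivalents, for one key only
sum_a   <- sum(df$b[df$a == "a"])  # sumif for key "a"
count_a <- sum(df$a == "a")        # countif for key "a"
``````

For a single key, plain logical indexing as in the last two lines is usually enough; the grouped form is only needed when you want every key at once.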