user85727, 4 years ago
R Question

Using dplyr to count, having an issue

This is probably a brain fart on my part but I'd like some help.

I have a data frame:

dftest <- data.frame(
"id" = c(rep("A",5),rep("B",5),rep("C",5)),
"time" = c(0,1,2,3,4,0,1,2,3,4,0,1,2,3,4),
"val" = c(1,2,2,2,2,1,2,2,2,2,2,1,1,1,1))


I'm trying to use the data frame to find, for each time, the number of times the val column equals 2 divided by the total number of entries at that time.

So for the above data frame, for time = 0, val = 2 for id = "C", so the result would be 1/3, whereas for time 1, val = 2 for id="A" and id="B", so the result would be 2/3.

How can I do this in dplyr?

Answer

You can find proportions by calling mean() on a logical vector: TRUE and FALSE are coerced to 1 and 0, so the mean is the fraction of TRUE values. For example:

library(dplyr)

dftest %>% group_by(time) %>%
    summarize(proptwo = mean(val==2))
#   A tibble: 5 × 2
#    time   proptwo
#   <dbl>     <dbl>
# 1     0 0.3333333
# 2     1 0.6666667
# 3     2 0.6666667
# 4     3 0.6666667
# 5     4 0.6666667
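
If it helps to see the numerator and denominator separately, here is a minimal sketch that should give the same proportions by counting explicitly. The column names n_two and n_total and the result name props are just illustrative choices, not anything from the question.

```r
library(dplyr)

dftest <- data.frame(
  id   = c(rep("A", 5), rep("B", 5), rep("C", 5)),
  time = c(0, 1, 2, 3, 4, 0, 1, 2, 3, 4, 0, 1, 2, 3, 4),
  val  = c(1, 2, 2, 2, 2, 1, 2, 2, 2, 2, 2, 1, 1, 1, 1)
)

# Spell out the count and the total for each time point.
props <- dftest %>%
  group_by(time) %>%
  summarize(
    n_two   = sum(val == 2),      # rows at this time where val equals 2
    n_total = n(),                # all rows at this time
    proptwo = n_two / n_total     # same value as mean(val == 2)
  )
```

One thing to watch, assuming your real data might have missing values: mean(val == 2) returns NA for any group containing an NA in val unless you pass na.rm = TRUE, in which case the NA rows are dropped from the denominator as well.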