marianess - 1 year ago 66
R Question

# Counting the number of zeros within a "melted" data frame

Hi, I am learning R and trying to count how many zeros there are in a melted data frame. Specifically, I want to know how many zeros correspond to columns b and c, and print the two results.
I generated an example:

```r
library(reshape)
library(plyr)
library(dplyr)
id = c(1,2,3,4,5,6,7,8,9,10)
b = c(0,0,5,6,3,7,2,8,1,8)
c = c(0,4,9,87,0,87,0,4,5,0)
test = data.frame(id,b,c)
test_melt = melt(test, id.vars = "id")
test_melt
```

I imagine I need an if statement for this, something like `if (test$value == 0) {print()}`, but how can I tell R to count the zeros for columns that have been melted?

With dplyr:

```r
test_melt %>%
  group_by(variable) %>%
  summarize(zeroes = sum(value == 0))
# # A tibble: 2 x 2
#   variable zeroes
#     <fctr>  <int>
# 1        b      2
# 2        c      4
```
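Why this works, as a minimal sketch: `value == 0` is evaluated element-wise and returns a logical vector, and `sum()` counts each `TRUE` as 1. An `if` statement, by contrast, expects a single `TRUE`/`FALSE`, which is why it is not the right tool for a whole column. The vector `value` below is just the `b` half of the example data, and the `NA` case is an added assumption that is not part of the question.

```r
# The b values from the example, used as a stand-alone vector
value <- c(0, 0, 5, 6, 3, 7, 2, 8, 1, 8)

is_zero <- value == 0   # element-wise comparison, gives a logical vector
n_zero <- sum(is_zero)  # TRUE counts as 1, FALSE as 0
print(n_zero)           # [1] 2

# Assumption: if the column may contain NA, the comparison yields NA as well,
# so those entries are dropped before summing
value_na <- c(0, NA, 5, 0)
n_zero_na <- sum(value_na == 0, na.rm = TRUE)
print(n_zero_na)        # [1] 2
```

The grouped versions in this answer apply exactly this `sum(value == 0)` step once per level of `variable`.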

Base R:

```r
aggregate(test_melt$value, by = list(variable = test_melt$variable),
          FUN = function(x) sum(x == 0))
#   variable x
# 1        b 2
# 2        c 4
```
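Two further base R variants, offered as a sketch and not as part of the original answer: `tapply()` on the melted frame, and `colSums()` on the original wide frame, which avoids melting altogether. To keep the block runnable without `reshape`, `test_melt` is rebuilt by hand here with the same three columns that `melt()` produces (an assumption made for illustration).

```r
# Wide data, same values as in the question
test <- data.frame(
  id = 1:10,
  b = c(0, 0, 5, 6, 3, 7, 2, 8, 1, 8),
  c = c(0, 4, 9, 87, 0, 87, 0, 4, 5, 0)
)

# Hand-built stand-in for melt(test, id.vars = "id")
test_melt <- data.frame(
  id = rep(test$id, times = 2),
  variable = factor(rep(c("b", "c"), each = nrow(test))),
  value = c(test$b, test$c)
)

# Count zeros per group on the long (melted) data
zeros_long <- tapply(test_melt$value == 0, test_melt$variable, sum)
print(zeros_long)

# Count zeros per column on the wide data, no melting needed
zeros_wide <- colSums(test[, c("b", "c")] == 0)
print(zeros_wide)
```

Both calls return a named vector with 2 for `b` and 4 for `c`, matching the `aggregate()` result.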

... and out of curiosity, a quick benchmark:

```r
library(microbenchmark)
microbenchmark(
  dplyr = group_by(test_melt, variable) %>% summarize(zeroes = sum(value == 0)),
  base1 = aggregate(test_melt$value, by = list(variable = test_melt$variable), FUN = function(x) sum(x == 0)),
  # @PankajKaundal's suggested "formula" notation reads easier
  base2 = aggregate(value ~ variable, test_melt, function(x) sum(x == 0))
)
# Unit: microseconds
#   expr     min      lq      mean    median        uq      max neval
#  dplyr 916.421 986.985 1069.7000 1022.1760 1094.7460 2272.636   100
#  base1 647.658 682.302  783.2065  715.3045  765.9940 1905.411   100
#  base2 813.219 867.737  950.3247  897.0930  959.8175 2017.001   100
```
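With only 20 rows, these timings mostly reflect fixed per-call overhead, so the ranking may not carry over to larger data. A rough way to check is sketched below with base R only. The frame `big`, its size, and the `tapply()` variant are assumptions added for illustration, and no timings are shown because they depend on the machine.

```r
# Synthetic long-format data: two groups, integer values 0..9
set.seed(1)
n <- 1e5
big <- data.frame(
  variable = factor(sample(c("b", "c"), n, replace = TRUE)),
  value = sample(0:9, n, replace = TRUE)
)

# Time the formula version of aggregate() from the benchmark above
t_aggregate <- system.time(
  res_aggregate <- aggregate(value ~ variable, big, function(x) sum(x == 0))
)

# Time a tapply() variant (not part of the original benchmark)
t_tapply <- system.time(
  res_tapply <- tapply(big$value == 0, big$variable, sum)
)

print(t_aggregate)
print(t_tapply)
```

Both approaches should return the same counts, so any difference is purely in run time.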