marianess - 2 months ago
R Question

Counting the number of zeros within a "melted" data frame

Hi, I am learning R and I am trying to count how many zeros I have in melted data. Specifically, I want to know how many zeros correspond to column b and how many to column c, and print the two results.
I generated an example:

library(reshape)
library(plyr)
library(dplyr)
id = c(1,2,3,4,5,6,7,8,9,10)
b = c(0,0,5,6,3,7,2,8,1,8)
c = c(0,4,9,87,0,87,0,4,5,0)
test = data.frame(id,b,c)
test_melt = melt(test, id.vars = "id")
test_melt


I imagine I need an if statement for that, something like
if (test_melt$value == 0){print()}, but how can I tell R to count the zeros for each of the columns that have been melted?

Answer

With your data, using dplyr:

test_melt %>%
  group_by(variable) %>%
  summarize(zeroes = sum(value == 0))
# # A tibble: 2 x 2
#   variable zeroes
#     <fctr>  <int>
# 1        b      2
# 2        c      4
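
The reason sum(value == 0) counts zeros is that the comparison returns a logical vector and sum() treats each TRUE as 1. An if statement, by contrast, expects a single TRUE or FALSE rather than one value per row. A minimal sketch of the idea in base R, with the melted values typed in by hand (the names value, variable and is_zero are only illustrative):

```r
# Values of columns b and c from the question, stacked the way melt() stacks them
value    <- c(0, 0, 5, 6, 3, 7, 2, 8, 1, 8,
              0, 4, 9, 87, 0, 87, 0, 4, 5, 0)
variable <- factor(rep(c("b", "c"), each = 10))

is_zero <- value == 0                     # logical vector, one TRUE/FALSE per row
total   <- sum(is_zero)                   # TRUE counts as 1: total number of zeros
zeros_b <- sum(is_zero[variable == "b"])  # zeros that came from column b
zeros_c <- sum(is_zero[variable == "c"])  # zeros that came from column c

print(c(total = total, b = zeros_b, c = zeros_c))
```

Grouping functions such as group_by() or aggregate() simply apply this same sum to each level of variable.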

Base R:

aggregate(test_melt$value, by = list(variable = test_melt$variable),
          FUN = function(x) sum(x == 0))
#   variable x
# 1        b 2
# 2        c 4
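
Two other base R options might also be worth a look: tapply() on the melted data, or colSums() on the original wide data, which skips the melt step entirely. This is only a sketch. It rebuilds the example without reshape, and test_long, zeros_long and zeros_wide are names made up for the illustration:

```r
# Same data as in the question, built without extra packages
test <- data.frame(id = 1:10,
                   b = c(0, 0, 5, 6, 3, 7, 2, 8, 1, 8),
                   c = c(0, 4, 9, 87, 0, 87, 0, 4, 5, 0))

# Long format with the same columns as melt(test, id.vars = "id"), built by hand
test_long <- data.frame(id = rep(test$id, 2),
                        variable = factor(rep(c("b", "c"), each = 10)),
                        value = c(test$b, test$c))

# Count zeros per group on the long data
zeros_long <- tapply(test_long$value == 0, test_long$variable, sum)

# Count zeros per column directly on the wide data
zeros_wide <- colSums(test[, c("b", "c")] == 0)

print(zeros_long)
print(zeros_wide)
```

Both results are named vectors, so a single count can be pulled out with, for example, zeros_wide["b"].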

... and out of curiosity, a quick benchmark of the three approaches:

library(microbenchmark)
microbenchmark(
  dplyr = group_by(test_melt, variable) %>% summarize(zeroes = sum(value == 0)),
  base1 = aggregate(test_melt$value, by = list(variable = test_melt$variable), FUN = function(x) sum(x == 0)),
  # @PankajKaundal's suggested "formula" notation reads easier
  base2 = aggregate(value ~ variable, test_melt, function(x) sum(x == 0))
)
# Unit: microseconds
#   expr     min      lq      mean    median        uq      max neval
#  dplyr 916.421 986.985 1069.7000 1022.1760 1094.7460 2272.636   100
#  base1 647.658 682.302  783.2065  715.3045  765.9940 1905.411   100
#  base2 813.219 867.737  950.3247  897.0930  959.8175 2017.001   100