marianess - 7 months ago 32

R Question

Hi, I am learning R and trying to count how many zeros I have in melted data. Specifically, I want to know how many zeros correspond to each of the columns b and c, and print the two results.

I generated an example:

```
library(reshape)
library(plyr)
library(dplyr)

id = c(1,2,3,4,5,6,7,8,9,10)
b = c(0,0,5,6,3,7,2,8,1,8)
c = c(0,4,9,87,0,87,0,4,5,0)
test = data.frame(id,b,c)
test_melt = melt(test, id.vars = "id")
test_melt
```

I imagine I need an if statement for this, something like `if (test$value == 0){print()}`, but how can I tell R to count the zeros for each column that has been melted?

Answer

With your data, using dplyr:

```
test_melt %>%
  group_by(variable) %>%
  summarize(zeroes = sum(value == 0))
# # A tibble: 2 x 2
#   variable zeroes
#     <fctr>  <int>
# 1        b      2
# 2        c      4
```
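The `sum(value == 0)` idiom works because a comparison returns a logical vector and `sum()` counts each `TRUE` as 1, so no `if` statement is needed. A minimal sketch using column `b` from the question (the names `is_zero` and `zero_count` are only for illustration):

```r
# Column b from the question's example data
b <- c(0, 0, 5, 6, 3, 7, 2, 8, 1, 8)

# The comparison is vectorized: one TRUE/FALSE per element
is_zero <- b == 0

# sum() treats TRUE as 1 and FALSE as 0, so this counts the zeros
zero_count <- sum(is_zero)
print(zero_count)
```

An `if` statement expects a single TRUE/FALSE condition, which is why it does not fit a whole column of values.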

Base R:

```
aggregate(test_melt$value, by = list(variable = test_melt$variable),
          FUN = function(x) sum(x == 0))
#   variable x
# 1        b 2
# 2        c 4
```
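Another base R option is `tapply()`, which applies a function to a vector split by a grouping factor. The sketch below is self-contained, so it builds the long-format data by hand instead of calling `melt()`; `test_long` and `zero_counts` are illustrative names, not part of the original code:

```r
# Example data from the question
test <- data.frame(
  id = 1:10,
  b  = c(0, 0, 5, 6, 3, 7, 2, 8, 1, 8),
  c  = c(0, 4, 9, 87, 0, 87, 0, 4, 5, 0)
)

# Hand-built stand-in for melt(test, id.vars = "id")
test_long <- data.frame(
  id       = rep(test$id, times = 2),
  variable = factor(rep(c("b", "c"), each = nrow(test))),
  value    = c(test$b, test$c)
)

# Sum the logical vector (value == 0) within each level of variable
zero_counts <- tapply(test_long$value == 0, test_long$variable, sum)
print(zero_counts)

# If only the counts matter, the wide data can be used directly
wide_counts <- colSums(test[, c("b", "c")] == 0)
print(wide_counts)
```

If the melted form is not needed for anything else, the `colSums()` line avoids reshaping altogether.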

... and, out of curiosity, a quick benchmark:

```
library(microbenchmark)
microbenchmark(
  dplyr = group_by(test_melt, variable) %>% summarize(zeroes = sum(value == 0)),
  base1 = aggregate(test_melt$value, by = list(variable = test_melt$variable), FUN = function(x) sum(x == 0)),
  # @PankajKaundal's suggested "formula" notation reads easier
  base2 = aggregate(value ~ variable, test_melt, function(x) sum(x == 0))
)
# Unit: microseconds
#   expr     min      lq      mean    median        uq      max neval
#  dplyr 916.421 986.985 1069.7000 1022.1760 1094.7460 2272.636   100
#  base1 647.658 682.302  783.2065  715.3045  765.9940 1905.411   100
#  base2 813.219 867.737  950.3247  897.0930  959.8175 2017.001   100
```