Daniel - 1 year ago
R Question

# Counting 0s, 1s, 99s and NAs for each variable in a data frame

I have a data frame with 118 variables containing `0`s, `1`s, `99`s and `NA`s. For each variable, I need to count how many `99`s, `NA`s, `1`s and `0`s there are (`99` is "not applicable", `0` is "no", `1` is "yes" and `NA` is "no answer"). I tried to do this with the `table` function, but it works on vectors. How can I do it for the whole set of variables?

Here is a small reproducible example of the data frame:

``````
forest <- c(1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0, NA, 0, NA, 0, 99, 99, 1, 0, NA)
water <- c(1, NA, NA, NA, NA, 99, 99, 0, 0, 0, 1, 1, 1, 0, 0, NA, NA, 99, 1, 0)
rain <- c(1, NA, 1, 0, 1, 99, 99, 0, 1, 0, 1, 0, 1, 0, 0, NA, 99, 99, 1, 1)
fire <- c(1, 0, 0, 0, 1, 99, 99, NA, NA, NA, 1, 0, 1, 0, 0, NA, 99, 99, 1, 1)

df <- data.frame(forest, water, rain, fire)
``````

I need to write the result for each variable to a data frame, like this:

``````
      forest    water    rain    fire
1     8         5        8       6
0     7         6        6       6
99    2         3        4       4
NA    3         6        2       4
``````

Can't find a good dupe, so here's my comment as an answer:

A data frame is really a list of columns. `lapply` will apply a function to every item in the input (every column, in the case of a data frame) and return a list with each result:

``````
lapply(df, table)
# $forest
#
#  0  1 99
#  7  8  2
#
# $water
#
#  0  1 99
#  6  5  3
#
# $rain
#
#  0  1 99
#  6  8  4
#
# $fire
#
#  0  1 99
#  6  6  4
``````

`sapply` is like `lapply`, but it will attempt to simplify the result instead of always returning a `list`. In both cases, you can pass along additional arguments to the function being applied, like `useNA = "always"` to `table` to have `NA` included in the output:

``````
sapply(df, table, useNA = "always")
#      forest water rain fire
# 0         7     6    6    6
# 1         8     5    8    6
# 99        2     3    4    4
# <NA>      3     6    2    4
``````
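If you want the result as an actual data frame with the rows in the order from the question (1, 0, 99, NA), something like the following should work. It is only a sketch and assumes that 0, 1 and 99 are the only non-missing codes in your data; the names `codes` and `counts` are mine. Converting each column to a factor with fixed levels means every column produces a table of the same length, so `sapply` can still simplify to a matrix even if some code never appears in one of your 118 variables.

```r
# Example data from the question
forest <- c(1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0, NA, 0, NA, 0, 99, 99, 1, 0, NA)
water <- c(1, NA, NA, NA, NA, 99, 99, 0, 0, 0, 1, 1, 1, 0, 0, NA, NA, 99, 1, 0)
rain <- c(1, NA, 1, 0, 1, 99, 99, 0, 1, 0, 1, 0, 1, 0, 0, NA, 99, 99, 1, 1)
fire <- c(1, 0, 0, 0, 1, 99, 99, NA, NA, NA, 1, 0, 1, 0, 0, NA, 99, 99, 1, 1)
df <- data.frame(forest, water, rain, fire)

# Assumption: these are the only non-missing codes, listed in the desired row order
codes <- c(1, 0, 99)

# Count each code per column; fixed factor levels keep every table the same length
counts <- sapply(df, function(x) table(factor(x, levels = codes), useNA = "always"))

# Give the NA row a plain string label, then convert the matrix to a data frame
rownames(counts) <- c(codes, "NA")
counts <- as.data.frame(counts)
counts
#    forest water rain fire
# 1       8     5    8    6
# 0       7     6    6    6
# 99      2     3    4    4
# NA      3     6    2    4
```

If your real data contains other codes, add them to `codes`; values not listed there would be counted in the `NA` row.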

For lots more info, check out "R Grouping functions: sapply vs. lapply vs. apply. vs. tapply vs. by vs. aggregate".

To compare with some other answers: `apply` is similar to `lapply` and `sapply`, but it is intended for use with matrices or higher-dimensional arrays. The only time you should use `apply` on a `data.frame` is when you need to apply a function to each row. For functions on data frame columns, prefer `lapply` or `sapply`. The reason is that `apply` will coerce the data frame to a `matrix` first, which can have unintended consequences if you have columns of different classes.
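To see the coercion in action, here is a quick illustration using a made-up data frame with one character column and one numeric column (`mixed` is hypothetical, not part of the question's data). `sapply` sees each column with its own class, while `apply` sees only columns of the character matrix the data frame was converted to:

```r
# Hypothetical data frame with columns of different classes
mixed <- data.frame(
  id = c("a", "b", "c"),
  answer = c(1, 0, 99),
  stringsAsFactors = FALSE
)

# Column-wise over the data frame as a list: classes are preserved
by_sapply <- sapply(mixed, class)

# apply() first calls as.matrix(), which yields a character matrix here,
# so the numeric column is no longer numeric inside the applied function
by_apply <- apply(mixed, 2, class)

by_sapply
by_apply
```

With the all-numeric data frame from the question the coercion happens to be harmless, which is why `apply(df, 2, ...)` can appear to work fine there.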
