javivr - 1 year ago 91
R Question

# Subsetting a table of two variables to remove the 0 values

I have a table such as the following sample, which is obtained from a df (0==control; 1==case; heading figures are strata):

``````
    208 209 210 211 212 213
0     4  16   3   5   2   0
1     0   7   2   0   6   2
``````

I need to create a new df removing those strata with 0 cases (1) or controls (0).

So far, I have written the following code, which produces a logical matrix:

``````
table(df$status, df$strata) > 0
``````

but haven't managed to go further.

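We can use `subset`, keeping only the strata whose column in the frequency table (`tbl`, defined under "data" below) contains no zero: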

``````
subset(df, strata %in% dimnames(tbl)[[2]][colSums(tbl==0)==0])
#  status strata
#1       0    211
#5       1    209
#7       0    209
#8       1    208
#9       1    211
#10      0    208
``````
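To see why the `colSums(tbl==0)==0` condition picks the right strata, here is a minimal sketch that rebuilds the sample table from the question by hand and walks through the intermediate steps. The names `tbl_q`, `zero_cells` and `keep` are made up for this illustration and do not appear in the original post.

```r
# Rebuild the sample table from the question as a matrix
# (tbl_q is a hypothetical name used only for this illustration)
tbl_q <- matrix(c(4, 16, 3, 5, 2, 0,
                  0,  7, 2, 0, 6, 2),
                nrow = 2, byrow = TRUE,
                dimnames = list(status = c("0", "1"),
                                strata = as.character(208:213)))

# Number of zero cells in each stratum (column)
zero_cells <- colSums(tbl_q == 0)

# Keep only strata with no zero cell, i.e. at least one control and one case
keep <- colnames(tbl_q)[zero_cells == 0]
keep
# [1] "209" "210" "212"
```

Strata 208 and 211 are dropped because they have no cases, and 213 because it has no controls. The resulting names are what `%in%` then matches against the `strata` column of the data frame.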

I think the question is not about checking whether 'df' is equal to 0. In fact, the OP wants to subset the dataset based on the frequencies in the table.

A compact option would be to use `data.table`

``````
library(data.table)
setDT(df)[, if(uniqueN(status)>1) .SD , by = .(strata)]
#    strata status
#1:    211      0
#2:    211      1
#3:    209      1
#4:    209      0
#5:    208      1
#6:    208      0
``````

Here we convert the 'data.frame' to a 'data.table' (`setDT(df)`) and group by 'strata'. `if` the number of unique elements in 'status' (`uniqueN(status)`) is greater than 1 (in this case 2, meaning both cases and controls are present), we return the Subset of Data.table (`.SD`) for that group; otherwise the group is dropped.
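The same "more than one distinct status per stratum" test can also be sketched in base R with `ave`, in case loading a package is not an option. This is an alternative to the code above, not part of the original answer, and the `toy` data frame with its values is invented purely for illustration.

```r
# Toy data in the same shape as 'df' (values are made up for illustration)
toy <- data.frame(status = c(0, 1, 0, 0, 1, 1),
                  strata = c(208, 208, 209, 209, 210, 211))

# For every row, count the distinct 'status' values within its stratum
n_status <- ave(toy$status, toy$strata,
                FUN = function(x) length(unique(x)))

# Keep rows from strata where both controls (0) and cases (1) occur
kept <- toy[n_status > 1, ]
kept
```

Only the two rows of stratum 208 survive here, since it is the only stratum in the toy data with both a 0 and a 1.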

An option using similar logic in `dplyr` is

``````
library(dplyr)
df %>%
  group_by(strata) %>%
  filter(n_distinct(status) > 1)
``````

### data

``````
set.seed(24)
df <- data.frame(status = sample(0:1, 10, replace = TRUE),
                 strata = sample(208:213, 10, replace = TRUE))

tbl <- table(df)
``````
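As a quick sanity check on the example data, one could confirm that the `subset` approach leaves no empty cell in the frequency table of the result. This is only a sketch: `res` is a name introduced here, and the exact rows drawn by `sample` depend on the R version, so the check avoids assuming specific values.

```r
# Recreate the example data and its frequency table
set.seed(24)
df <- data.frame(status = sample(0:1, 10, replace = TRUE),
                 strata = sample(208:213, 10, replace = TRUE))
tbl <- table(df)

# Apply the subset() approach; 'res' is a hypothetical name for the result
res <- subset(df, strata %in% dimnames(tbl)[[2]][colSums(tbl == 0) == 0])

# Every remaining stratum should have at least one control and one case
no_empty_cells <- all(table(res) > 0)
no_empty_cells
```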