Sharath Sharath - 24 days ago 8
R Question

Remove the rows with Value equal to zero for every ID that also has values greater than zero

I have a data frame like this:

ID <- c("A","A","A","B","B","C","D")
Value <- c(0,1,2,0,2,0,0)
df <- data.frame(ID,Value)
df


The logic I am trying to apply: if an ID has any Value greater than 0, remove the rows of that ID where Value is 0. IDs whose only Value is 0 should be kept.

My desired output is:

ID Value
A 1
A 2
B 2
C 0
D 0


I tried doing it this way:

df <- subset(df,df$Value !=0)


I know this is wrong, since it removes every row where Value is 0, including the rows for IDs C and D that I want to keep. Any input on how to solve this would be appreciated.

Answer

The vanilla way, in base R:

# get ids with values greater than 0
delete_zero <- unique(subset(df, Value > 0)$ID)

# delete the rows where the ID is in delete_zero AND the value is 0
df2 <- subset(df, !(ID %in% delete_zero & Value == 0))

df2
#   ID Value
# 2  A     1
# 3  A     2
# 5  B     2
# 6  C     0
# 7  D     0
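
If you would rather skip the intermediate vector of IDs, the same condition can be computed per row in base R. This is only a sketch: `ave()` copies each ID's maximum Value back onto every row of that ID, so "the ID has some value greater than 0" becomes "the group maximum is greater than 0". The names `group_max` and `df3` are made up for illustration, and it assumes Value contains no NAs.

```r
ID <- c("A","A","A","B","B","C","D")
Value <- c(0,1,2,0,2,0,0)
df <- data.frame(ID, Value)

# largest Value within each ID, repeated on every row of that ID
group_max <- ave(df$Value, df$ID, FUN = max)

# drop a row only if its ID has a positive value AND the row itself is 0
df3 <- df[!(group_max > 0 & df$Value == 0), ]
df3
```

This should keep the same five rows as `df2` above.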

The newfangled way: the same logic, applied "by group" with dplyr, so any(Value > 0) is evaluated separately for each ID:

library(dplyr)
df %>% group_by(ID) %>%
    filter(!(any(Value > 0) & Value == 0))

# Source: local data frame [5 x 2]
# Groups: ID [4]
# 
#       ID Value
#   <fctr> <dbl>
# 1      A     1
# 2      A     2
# 3      B     2
# 4      C     0
# 5      D     0
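
One thing to keep in mind: the result above is still grouped by ID (that is what the "Groups: ID [4]" line means), so later dplyr verbs would keep working per ID. If that is not what you want, a possible variation is to finish the pipeline with `ungroup()`. The name `result` is just for illustration.

```r
library(dplyr)

ID <- c("A","A","A","B","B","C","D")
Value <- c(0,1,2,0,2,0,0)
df <- data.frame(ID, Value)

result <- df %>%
    group_by(ID) %>%
    # within each ID: drop zero rows only when the ID has some positive value
    filter(!(any(Value > 0) & Value == 0)) %>%
    # remove the grouping so later steps operate on the whole table
    ungroup()

result
```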