Science11 Science11 - 15 days ago 5
R Question

Filter a dataframe by its entry

How do I filter a dataset by a specific value that can occur anywhere in the data frame and not necessarily under any one column or row ?

Suppose I have a data frame that is like this.

id gender group Student_Math_1 Student_Math_2 Student_Read_1 Student_Read_2
46 M Red 23 45 37 56
46 M Red 34 36 33 78
46 M Red 56 63 58 NA
62 F Blue 59 NA NA 68
62 F Blue NA 68 87 73
38 M Red 78 57 NA 65
38 M Red NA 75 54 NA
17 F Blue 74 NA 56 72
17 F Blue 75 61 NA 79
17 F Blue NA 74 43 81


And I am trying to subset this data frame such that I retain all the rows and columns that contain the value
68
regardless of where it occurs within the data frame.

The final output would be

id gender group Student_Math_1 Student_Math_2 Student_Read_1 Student_Read_2

62 F Blue 59 NA NA 68
62 F Blue NA 68 87 73


Any tips or suggestions are welcome. Thanks in advance.

df = structure(list(id = c(46, 46, 46, 62, 62, 38, 38, 17, 17, 17),
gender = structure(c(2L, 2L, 2L, 1L, 1L, 2L, 2L, 1L, 1L,
1L), .Label = c("F", "M"), class = "factor"), group = structure(c(2L,
2L, 2L, 1L, 1L, 2L, 2L, 1L, 1L, 1L), .Label = c("Blue", "Red"
), class = "factor"), Student_Math_1 = c(23, 34, 56, 59,
NA, 78, NA, 74, 75, NA), Student_Math_2 = c(45, 36, 63, NA,
68, 57, 75, NA, 61, 74), Student_Read_1 = c(37, 33, 58, NA,
87, NA, 54, 56, NA, 43), Student_Read_2 = c(56, 78, NA, 68,
73, 65, NA, 72, 79, 81)), .Names = c("id", "gender", "group",
"Student_Math_1", "Student_Math_2", "Student_Read_1", "Student_Read_2"
), row.names = c(NA, -10L), class = "data.frame")

989 989
Answer

Alternatively,

df[unique(which(df==68, arr.ind = T)[,1]),]

#  id gender group Student_Math_1 Student_Math_2 Student_Read_1 Student_Read_2
#5 62      F  Blue             NA             68             87             73
#4 62      F  Blue             59             NA             NA             68

In this case, you don't need to care about the position of the columns or where NAs are appeared. unique is used in case 68 is appeared more than once per row.