Jonas Lindeløv - 2 months ago
R Question

Find string in data.frame



How do I search for a string in a data.frame? As a minimal example, how do I find the locations (columns and rows) of 'horse' in this data.frame?

> df = data.frame(animal=c('goat','horse','horse','two', 'five'), level=c('five','one','three',30,'horse'), length=c(10, 20, 30, 'horse', 'eight'))
> df
  animal level length
1   goat  five     10
2  horse   one     20
3  horse three     30
4    two    30  horse
5   five horse  eight


... so rows 4 and 5 have the wrong order. Any output that would allow me to identify that 'horse' has shifted to the level column in row 5 and to the length column in row 4 is good. Maybe:

> magic_function(df, 'horse')
col row
'animal', 2
'animal', 3
'length', 4
'level', 5


Here's what I want to use this for: I have a very large data frame (around 60 columns, 20,000 rows) in which some columns are messed up for some rows. It's too large to eyeball in order to identify the different ways the order can be wrong, so searching would be nice. I will use this info to move data to the correct columns for these rows.

Answer

What about:

which(df == "horse", arr.ind = TRUE)
#      row col
# [1,]   2   1
# [2,]   3   1
# [3,]   5   2
# [4,]   4   3
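
If column names are easier to read than column indices, the index matrix can be mapped back onto names(df). The following is a minimal sketch under that assumption; find_string is a hypothetical helper name chosen for illustration, not a base R function, and stringsAsFactors = FALSE is set only so the example behaves the same on R versions before 4.0.

```r
# Example data from the question.
df <- data.frame(
  animal = c("goat", "horse", "horse", "two", "five"),
  level  = c("five", "one", "three", 30, "horse"),
  length = c(10, 20, 30, "horse", "eight"),
  stringsAsFactors = FALSE
)

# find_string() is a hypothetical helper, not part of base R.
find_string <- function(data, pattern) {
  # Matrix of (row, col) index pairs for cells exactly equal to `pattern`
  hits <- which(data == pattern, arr.ind = TRUE)
  # Translate column indices into column names
  data.frame(
    col = names(data)[hits[, "col"]],
    row = unname(hits[, "row"]),
    stringsAsFactors = FALSE
  )
}

locations <- find_string(df, "horse")
print(locations)
#      col row
# 1 animal   2
# 2 animal   3
# 3  level   5
# 4 length   4
```

Note that == only matches cells whose whole value equals the search string, and the hits are listed column by column, which is why row 5 appears before row 4.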