Trevor Johnson - 4 months ago
R Question

Detect "" using grep() in R

Say I have the vector fish shown below, and I am trying to detect its 4th element, "", using grep(). Whenever I pass "" to grep(), the way I normally would to detect any other string, it returns the indices of all the elements.
I am working with a very large dataset that has lots of these empty values, and I want to replace them with "other".

fish <- c("a", "b", "c", "")
grep("", fish)


This returns

[1] 1 2 3 4


I would expect it to return

[1] 4


That way I could replace the empty value.

Answer

grep() is the wrong tool for that. If you must use it, anchor the pattern with ^ and $ so that it only matches the empty string:

grep("^$", c("a", " ", ""))
# [1] 3

But in R you simply do:

which(c("a", " ","") == "")
# [1] 3

or:

which(nchar(c("a"," ","")) == 0)
# [1] 3
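
Since the goal in the question is to replace the empty strings with "other", the same comparison can be used directly as a logical index, without which(). A minimal sketch using the fish vector from the question (the name is_empty is only illustrative):

```r
# Vector from the question; the 4th element is the empty string
fish <- c("a", "b", "c", "")

# Logical index: TRUE only where the element equals ""
is_empty <- fish == ""

# Assign the replacement to the matching positions
fish[is_empty] <- "other"

fish
# [1] "a"     "b"     "c"     "other"
```

The same indexing should carry over to a character column of a data frame, assuming the column is not a factor.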

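As to why grep() returns all 4 positions: grep() does not test whether the whole string equals the pattern. It looks for the pattern anywhere within each string, and the empty pattern "" is found in every string, so all 4 positions are returned. Replace "" with "b" and you see that immediately: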

grep("b", c("ab"," b", "b"))
# [1] 1 2 3
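
To see the difference on the vector from the question, the unanchored and anchored patterns can be compared side by side. This is a small sketch using grepl(), the logical counterpart of grep(); the names any_match and exact_empty are only illustrative:

```r
# Vector from the question
fish <- c("a", "b", "c", "")

# Unanchored empty pattern: found in every element
any_match <- grepl("", fish)
any_match
# [1] TRUE TRUE TRUE TRUE

# Anchored pattern: start of string immediately followed by end of string
exact_empty <- grepl("^$", fish)
exact_empty
# [1] FALSE FALSE FALSE  TRUE
```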