Tony Ladson - 2 months ago
R Question

Replace all occurrences of a string in a data frame

I'm working on a data frame in which non-detects are coded with '<'. Sometimes there is a space after the '<' and sometimes not, e.g. '< 2' or '<2'. I'd like to remove that space wherever it occurs.

Example:

data <- data.frame(name = rep(letters[1:3], each = 3), var1 = rep('< 2', 9), var2 = rep('<3', 9))

  name var1 var2
1    a  < 2   <3
2    a  < 2   <3
3    a  < 2   <3
4    b  < 2   <3
5    b  < 2   <3
6    b  < 2   <3
7    c  < 2   <3
8    c  < 2   <3
9    c  < 2   <3


This is where I've got to:

I can extract all the values and make the new strings but I can't put them back in the data frame.

library(stringr)

index <- str_detect(unlist(data), '<')
index <- matrix(index, nrow = 3)

data[index]
#[1] "< 2" "< 2" "< 2" "<3" "<3" "<3"

replacements <- str_replace_all(data[index], "<[ ]+","<")
replacements
#[1] "<2" "<2" "<2" "<3" "<3" "<3"

data[index] <- replacements

#Error in `[<-.data.frame`(`*tmp*`, index, value = c("<2", "<2", "<2", :
# unsupported matrix index in replacement

Answer

If you only need to replace every occurrence of "< " (with a space) with "<" (no space), you can lapply over the columns of the data frame with gsub. Because lapply returns a list, wrap the result in data.frame() to get a data frame back:

> data <- data.frame(lapply(data, function(x) {
+                  gsub("< ", "<", x)
+              }))
> data
  name var1 var2
1    a   <2   <3
2    a   <2   <3
3    a   <2   <3
4    b   <2   <3
5    b   <2   <3
6    b   <2   <3
7    c   <2   <3
8    c   <2   <3
9    c   <2   <3
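
A possible variation, offered as a sketch and not as part of the original answer: the pattern "<\\s+" also catches more than one space after the '<', and assigning with `[] <-` keeps the data frame structure without rebuilding it. The second half revisits the matrix-index attempt from the question. The error there appears to come from the index being 3 x 9 while the data frame is 9 x 3; `[<-.data.frame` accepts a logical matrix only when its dimensions match. The names `cleaned` and `by_index` are illustrative, and `stringsAsFactors = FALSE` is an assumption so the columns stay character.

```r
# Sketch only: base R, no stringr needed.
data <- data.frame(name = rep(letters[1:3], each = 3),
                   var1 = rep("< 2", 9),
                   var2 = rep("<3", 9),
                   stringsAsFactors = FALSE)

# Column-wise replacement. `[] <-` keeps `cleaned` a data frame,
# and "<\\s+" matches "<" followed by one or more whitespace characters.
cleaned <- data
cleaned[] <- lapply(cleaned, function(x) gsub("<\\s+", "<", x))

# Matrix-index variant: build the logical index with the same
# dimensions as the data frame (9 x 3 here, not 3 x 9).
by_index <- data
index <- matrix(grepl("<", unlist(by_index)), nrow = nrow(by_index))
by_index[index] <- gsub("<\\s+", "<", by_index[index])
```

Both `cleaned` and `by_index` should end up with '<2' in var1 and '<3' in var2. The lapply form is usually the simpler of the two, since it never needs the index at all.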