Vlad117 - 3 months ago
R Question

Replace part of a string (text mining)

I would like to replace the "Replace" part of each string in df$x with the first word of the corresponding value in df$y. I have a df like this:

x                y
ABC-Replace-YUI  M46 Hello
CBD-Replace-TYU  MD5 Hello
DBE-Replace-RTY  M6 Hello
EBF-Replace-ERT  M79 Hello
FBG-Replace-WER  MMM8 Hello


And I would like to get the following data:

x             y
ABC-M46-YUI   M46 Hello
CBD-MD5-TYU   MD5 Hello
DBE-M6-RTY    M6 Hello
EBF-M79-ERT   M79 Hello
FBG-MMM8-WER  MMM8 Hello


Unfortunately, I have no experience in text mining, and I need the most efficient way to do this because I have a huge dataset with a similar substitution in each row. Thank you.

Answer

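We can use str_replace to replace 'Replace' with the first word of each string in the 'y' column (extracted with word). str_replace is vectorised over both the string and the replacement, so each element of 'x' is paired with the corresponding element of 'y'.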

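library(stringr)
# word(df1$y, 1) extracts the first space-separated word of each y value;
# str_replace() substitutes it for the first "Replace" in the matching x
df1$x <- str_replace(df1$x, "Replace", word(df1$y, 1))
df1$x
#[1] "ABC-M46-YUI"  "CBD-MD5-TYU"  "DBE-M6-RTY"   "EBF-M79-ERT"  "FBG-MMM8-WER"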

data

df1 <- structure(list(x = c("ABC-Replace-YUI", "CBD-Replace-TYU", "DBE-Replace-RTY", 
"EBF-Replace-ERT", "FBG-Replace-WER"), y = c("M46 Hello", "MD5 Hello", 
"M6 Hello", "M79 Hello", "MMM8 Hello")), .Names = c("x", "y"), 
class = "data.frame", row.names = c(NA, -5L))
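
base R alternative

If stringr is not available, roughly the same substitution can be sketched in base R. This is only an illustration, not part of the answer above: the names first_word and new_x are made up for the example, and it assumes the first word of 'y' is everything before the first space. Note that base sub() is not vectorised over its replacement argument, so each string is paired with its own replacement via mapply().

```r
# Sample data mirroring the question (same values as df1 above)
df1 <- data.frame(
  x = c("ABC-Replace-YUI", "CBD-Replace-TYU", "DBE-Replace-RTY",
        "EBF-Replace-ERT", "FBG-Replace-WER"),
  y = c("M46 Hello", "MD5 Hello", "M6 Hello", "M79 Hello", "MMM8 Hello"),
  stringsAsFactors = FALSE
)

# First word of y: drop everything from the first space onward
# (assumption: words are separated by a single space character)
first_word <- sub(" .*$", "", df1$y)

# Pair each x with its own replacement; fixed = TRUE matches the
# text "Replace" literally instead of treating it as a regex
new_x <- mapply(function(s, r) sub("Replace", r, s, fixed = TRUE),
                df1$x, first_word, USE.NAMES = FALSE)
new_x
```

mapply() loops at the R level, so on a very large dataset this may well be slower than the vectorised stringr call; it is worth timing both on a sample of the real data before choosing.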