Ankita Rane - 1 month ago
R Question

R remove tags starting with U

How do I remove tags like

<U+0924><U+0930><U+0938><U+0902><U+0917><U+0924>

when they appear inside sentences? For example:
Via- <U+0924><U+0930><U+094D><U+0915><U+0938><U+0902><U+0917><U+0924> - Tarksangat ~<U+0938><U+092F><U+094D><U+092F><U+0926> <U+092E><U+0902><U+095B><U+0930> <U+0907><U+092E><U+093E><U+092E>


The output I need is:
Via- Tarksangat


Can anyone help me? Thanks!

Answer

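This might be useful. It splits the string on punctuation, strips the digits, and keeps only the pieces longer than two characters. It therefore relies on the real words being at least three characters long and on the hex codes in the tags leaving fewer than three letters behind once the digits are gone.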

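# Split on punctuation; each <U+XXXX> tag breaks into "U" and its hex code
ab <- unlist(strsplit(abc, "[[:punct:]]"))
# Drop any remaining punctuation and all digits
ab <- gsub("[[:punct:]]|[0-9]", "", ab)
# Keep only the pieces longer than two characters and rejoin them with "-"
ab <- paste0(ab[nchar(ab) > 2], collapse = "-")
ab
[1] "Via- Tarksangat "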
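
An alternative that targets the tags directly is sketched below. It assumes the tags are literal text in the string (as in the data shown), each of the form `<U+` followed by 4 to 8 hex digits and `>`. The function name `strip_u_tags` is made up for this example, and the cleanup of the leftover " - " and "~" separators is an assumption about what should count as noise, so adjust it to your data.

```r
# Hypothetical helper: remove literal <U+XXXX> tags, then tidy what is left.
strip_u_tags <- function(x) {
  # 1. Delete every <U+XXXX> tag (4 to 8 hex digits).
  x <- gsub("<U\\+[0-9A-Fa-f]{4,8}>", "", x)
  # 2. Delete punctuation that stands alone between spaces (e.g. " - ", " ~ ").
  #    Punctuation attached to a word, like the "-" in "Via-", is kept.
  x <- gsub("(?<!\\S)[[:punct:]]+(?!\\S)", "", x, perl = TRUE)
  # 3. Collapse runs of whitespace and trim both ends.
  trimws(gsub("\\s+", " ", x))
}

abc <- "Via- <U+0924><U+0930><U+094D><U+0915><U+0938><U+0902><U+0917><U+0924> - Tarksangat ~<U+0938><U+092F><U+094D><U+092F><U+0926> <U+092E><U+0902><U+095B><U+0930> <U+0907><U+092E><U+093E><U+092E>"

cleaned <- strip_u_tags(abc)
cleaned
```

Because `gsub` is vectorised, the same function can be applied to a whole character vector or data frame column at once. Unlike the split-and-filter approach, it does not drop short words such as "to" or "is".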

data

abc <- "Via- <U+0924><U+0930><U+094D><U+0915><U+0938><U+0902><U+0917><U+0924> - Tarksangat ~<U+0938><U+092F><U+094D><U+092F><U+0926> <U+092E><U+0902><U+095B><U+0930> <U+0907><U+092E><U+093E><U+092E>"