Remi.b - 2 months ago
R Question

Extracting numbers from string in R

I have a list of strings which contain random characters such as:

list=list()
list[1] = "djud7+dg[a]hs667"
list[2] = "7fd*hac11(5)"
list[3] = "2tu,g7gka5"


I want to know which numbers are present at least once (unique()) in this list. For my example, the solution is:

solution:
c(7,667,11,5,2)


A method that treats 11 as "one and one" rather than "eleven" would also be useful. In that case, the solution would be:

solution:
c(7,6,1,5,2)


(I found this post on a related subject: Extracting numbers from vectors (of strings))

Answer

For the second case (single digits), you can use gsub to remove everything from each string that is not a digit, then split what remains into single characters:

unique(as.numeric(unlist(strsplit(gsub("[^0-9]", "", unlist(ll)), ""))))
# [1] 7 6 1 5 2
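
To see what each step does, the one-liner can be unpacked. This is only a minimal sketch: it assumes ll holds the three strings from the question, and the intermediate names digits_only and single_digits are made up here for illustration.

```r
# Assumed setup: the question's data, stored as ll rather than list
ll <- list("djud7+dg[a]hs667", "7fd*hac11(5)", "2tu,g7gka5")

# 1. Drop every character that is not a digit
digits_only <- gsub("[^0-9]", "", unlist(ll))
print(digits_only)
# [1] "7667" "7115" "275"

# 2. Split each remaining string into single characters
single_digits <- unlist(strsplit(digits_only, ""))

# 3. Convert to numeric and keep the first occurrence of each digit
result_digits <- unique(as.numeric(single_digits))
print(result_digits)
# [1] 7 6 1 5 2
```

Note that unique keeps values in order of first appearance, which is why the result is not sorted.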

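For the first case (whole numbers), similarly use strsplit, this time splitting on runs of non-digits. A string that starts with a non-digit yields a leading empty string, which as.numeric turns into NA, hence the na.omit: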

unique(na.omit(as.numeric(unlist(strsplit(unlist(ll), "[^0-9]+")))))
# [1]   7 667  11   5   2
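
As an alternative sketch (not part of the answer's code), the digit runs can be matched directly with the base R functions gregexpr and regmatches instead of splitting on everything else. No empty strings are produced this way, so na.omit is not needed. It assumes ll holds the three strings from the question; the names x, runs and result_runs are illustrative.

```r
# Assumed setup: the question's data, stored as ll rather than list
ll <- list("djud7+dg[a]hs667", "7fd*hac11(5)", "2tu,g7gka5")
x <- unlist(ll)

# gregexpr locates every run of one or more digits in each string;
# regmatches extracts the matched substrings as a list of character vectors
runs <- regmatches(x, gregexpr("[0-9]+", x))

# Flatten, convert to numeric, and keep each number once
result_runs <- unique(as.numeric(unlist(runs)))
print(result_runs)
# [1]   7 667  11   5   2
```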

PS: don't name your variable list (there is a built-in function called list). I've named your data ll instead.
