R User - 1 month ago
R Question

R Extract number from string

I want to extract a year from a string, but I have not been able to get it right. For example, the string may look like this:

Toy Story (1995)


Or it could look like this:

Twelve Monkeys (a.k.a. 12 Monkeys) (1995)


To extract the numbers, I currently use

year = gsub("(?<=\\()[^()]*(?=\\))(*SKIP)(*F)|.", "", x, perl=TRUE)


This works for strings in the first format, but my list also contains strings in the second format. For those, the text of every parenthesized group is kept, not just the year:

[1] 1995
[2] a.k.a. 12 Monkeys1995


I do not want the extra text, only the year. How do I get this?

Answer

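We can use str_extract from the stringr package, with a pattern that matches only a run of digits directly enclosed in parentheses: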

library(stringr)
as.numeric(str_extract(x, "(?<=\\()[0-9]+(?=\\))"))
#[1] 1995 1995

data

x <-  c("Toy Story (1995)", "Twelve Monkeys (a.k.a. 12 Monkeys) (1995)")
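
For readers who would rather not load stringr, the same lookaround idea can be sketched in base R. This is not part of the original answer. It assumes the year is always exactly four digits inside parentheses, and the names m, found and year are only illustrative.

```r
x <- c("Toy Story (1995)", "Twelve Monkeys (a.k.a. 12 Monkeys) (1995)")

# Locate the first run of exactly four digits directly enclosed in parentheses
# (assumption: years are always four digits)
m <- regexpr("(?<=\\()[0-9]{4}(?=\\))", x, perl = TRUE)

# regexpr returns -1 where nothing matched; start from NA so such titles stay NA
found <- m != -1L
year <- rep(NA_integer_, length(x))
year[found] <- as.integer(regmatches(x, m))

year
#[1] 1995 1995
```

Requiring four digits means an element such as "(12)" would not be picked up as a year. If the year can have a different length, [0-9]+ as in the answer above is the safer pattern.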