Dee Dee - 1 month ago 5
R Question

Regular expression for non-English characters

I need to check if some strings contain any non-English characters.

x = c('Kält', 'normal', 'normal with, punctuation ~-+!', 'normal with number 1234')
grep(pattern = ??, x) # Expected output: 1


Any ideas?

Answer

You can use the PCRE pattern [^[:ascii:]], which matches any character outside the ASCII range. It requires perl=TRUE:

x = c('Kält', 'normal', 'normal with, punctuation ~-+!', 'normal with number 1234')
grep(pattern = "[^[:ascii:]]", x, perl=TRUE) 
grep(pattern = "[^[:ascii:]]", x, value=TRUE, perl=TRUE) 

Output:

[1] 1
[1] "Kält"

See the R demo
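
If you want a TRUE/FALSE flag per string rather than indices, the same pattern works with grepl. The following is only a minimal sketch: the names has_non_ascii and contains_non_ascii are made up for illustration, and "\u00e4" is just "ä" written as an escape so the snippet does not depend on the file encoding. Keep in mind that "non-ASCII" is only a proxy for "non-English": an English word spelled with an accent, such as "naïve", would be flagged too.

```r
# Same sample vector as in the question ("\u00e4" is the letter ä)
x <- c("K\u00e4lt", "normal", "normal with, punctuation ~-+!", "normal with number 1234")

# One logical value per element: TRUE if the string has any non-ASCII character
has_non_ascii <- grepl("[^[:ascii:]]", x, perl = TRUE)

# Indices and values of the flagged elements
flagged_idx <- which(has_non_ascii)
flagged_val <- x[has_non_ascii]

# Hypothetical helper wrapping the check for reuse (name is illustrative)
contains_non_ascii <- function(s) grepl("[^[:ascii:]]", s, perl = TRUE)

print(has_non_ascii)
print(flagged_idx)
print(flagged_val)
```

A logical vector is convenient for subsetting or for adding a flag column to a data frame, for example df$flag <- contains_non_ascii(df$text), where df and text are placeholder names.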
