Denis - 1 month ago
R Question

Subset rows that contain only letters in R

My vector has around 3000 observations, like:

clients <- c("Greg Smith", "John Coolman", "Mr. Brown", "John Nightsmith (father)", "2 Nicolas Cage")


How can I subset the rows that contain only names made of letters? For example, only Greg Smith and John Coolman (no characters such as 0-9 . ? : [ } etc.).

Answer

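We can use grep with a pattern anchored from the start (^) to the end ($) of the string that allows only upper- or lower-case letters and spaces. With value = TRUE, grep returns the matching elements instead of their indices.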

grep('^[A-Za-z ]+$', clients, value = TRUE)
#[1] "Greg Smith"   "John Coolman"

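Or use the POSIX character class [[:alpha:]] in place of A-Za-z. Note that which characters count as alphabetic can depend on the locale.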

grep('^[[:alpha:] ]+$', clients, value = TRUE)
#[1] "Greg Smith"   "John Coolman"
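
If the names sit in a column of a data frame rather than in a bare vector, the same pattern can be used with grepl, which returns a logical vector suitable for row indexing. This is only a sketch: the data frame df and its columns id and name are made up for illustration and are not part of the question.

```r
clients <- c("Greg Smith", "John Coolman", "Mr. Brown",
             "John Nightsmith (father)", "2 Nicolas Cage")

# Hypothetical data frame; the column names are assumptions for illustration.
df <- data.frame(id = seq_along(clients), name = clients,
                 stringsAsFactors = FALSE)

# grepl() returns TRUE/FALSE for each element, so it can index rows directly.
letters_only <- grepl("^[A-Za-z ]+$", df$name)

clean    <- df[letters_only, ]   # rows whose name has only letters and spaces
rejected <- df[!letters_only, ]  # everything else, kept for inspection
```

Keeping the rejected rows in a separate object makes it easy to check what the pattern filtered out before discarding anything.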