HNSKD - 1 month ago
R Question

How to use regex for this expression (e.g "6.81E+10")?

I have a vector of strings and I just want to extract those values that take the form

  1. "[digit][.][digit][digit][E][+][digit][digit]" or

  2. "[digit][.][digit][digit][E][+][digit][digit][digit]"

An example would be:

  1. "6.81E+10" and

  2. "5.01E+110"

Let the vector a be as follows:


My command is:


I would like it to return:

[1] "1.23E+110" "1.77E+12"

But instead it returns:

Why can't it work?


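Your issue arises from the [E+] part of the pattern. Outside a bracket expression, + is a quantifier meaning "one or more", so E+ looks for one or more "E"s and never matches a literal "+". Inside a bracket expression, as in [E+], the + is literal, but the class matches only a single character (either "E" or "+"), so it still cannot match the two-character sequence "E+".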

To match the "+" character literally, either escape it with \\+ or put it in a character class of its own, [+].
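As a rough illustration of the difference, here is a minimal sketch using a single test string. The string x and the short patterns are made up for this demonstration and are not taken from the question.

```r
# Hypothetical test string (an assumption, not the asker's data)
x <- "6.81E+10"

# Unescaped +: "81", then one or more "E", then "10". The literal "+" is never matched.
quantifier <- grepl("81E+10", x)

# Bracket expression [E+]: exactly one character, either "E" or "+".
one_char_class <- grepl("81[E+]10", x)

# Escaped plus: "E" followed by a literal "+".
escaped <- grepl("81E\\+10", x)

# Plus in its own character class: same effect as escaping.
own_class <- grepl("81E[+]10", x)

# fixed = TRUE turns off regex interpretation, so "+" is just a character.
literal <- grepl("81E+10", x, fixed = TRUE)

print(c(quantifier = quantifier, one_char_class = one_char_class,
        escaped = escaped, own_class = own_class, literal = literal))
```

Only the last three calls should report a match; the first two fail for the reasons described above.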

The immediate fix to your suggested solution is

grep("^[[:digit:]]{1}[.]{1}[[:digit:]]{2}[E][+][[:digit:]]{2,3}$", a, value = TRUE)
# [1] "1.23E+110" "1.77E+12"

But, as others have suggested (in particular @thelatemail), a neater approach is

grep("^\\d[.]\\d{2}E[+]\\d{2,3}$", a, value=TRUE)
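To check that the two patterns behave the same way, one could run both against a small test vector. The vector a_demo below is hypothetical (the asker's original vector is not reproduced here); it simply mixes two valid values with a few near misses.

```r
# Hypothetical test vector (an assumption): two valid values plus near misses
a_demo <- c("1.23E+110", "1.77E+12",  # valid: d.ddE+dd or d.ddE+ddd
            "12.3E+10",               # two digits before the dot
            "1.2E+10",                # only one digit after the dot
            "1.23E-10",               # minus instead of plus
            "1.23E+1",                # exponent too short
            "1.23E+1000",             # exponent too long
            "abc")                    # not a number at all

# Verbose POSIX-class pattern from the immediate fix
verbose <- grep("^[[:digit:]]{1}[.]{1}[[:digit:]]{2}[E][+][[:digit:]]{2,3}$",
                a_demo, value = TRUE)

# Shorter pattern using \\d
short <- grep("^\\d[.]\\d{2}E[+]\\d{2,3}$", a_demo, value = TRUE)

print(verbose)
print(identical(verbose, short))
```

The ^ and $ anchors matter here: without them, a value such as "1.23E+1000" would match on its first three exponent digits.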