rharrington - 7 months ago
Javascript Question

In a regular expression, match one thing or another, or both

In a regular expression, I need to know how to match one thing or another, or both (in order). But at least one of the things needs to be there.

For example, the following regular expression

/^([0-9]+|\.[0-9]+)$/


will match

234


and

.56


but not

234.56


While the following regular expression

/^([0-9]+)?(\.[0-9]+)?$/


will match all three of the strings above, but it will also match the empty string, which we do not want.

I need something that will match all three of the strings above, but not the empty string. Is there an easy way to do that?
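To make the behaviour described above concrete, here is a small sketch that runs both patterns from the question against the three sample strings plus the empty string. The variable names (`eitherOnly`, `bothOptional`, `samples`) are only illustrative and are not part of the original question.

```javascript
// The two patterns from the question.
const eitherOnly = /^([0-9]+|\.[0-9]+)$/;        // one part or the other
const bothOptional = /^([0-9]+)?(\.[0-9]+)?$/;   // both parts optional

// The three sample strings, plus the empty string we want to reject.
const samples = ["234", ".56", "234.56", ""];

// Record what each pattern says about each sample.
const results = samples.map((s) => ({
  input: s,
  eitherOnly: eitherOnly.test(s),
  bothOptional: bothOptional.test(s),
}));

console.table(results);
```

The first pattern rejects `234.56`, and the second accepts the empty string, which is exactly the gap the question is about.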

UPDATE:

Both Andrew's and Justin's answers below work for the simplified example I provided, but (unless I'm mistaken) they don't work for the actual use case I was hoping to solve, so I should probably add that now. Here's the actual regexp I'm using:

/^\s*-?0*(?:[0-9]+|[0-9]{1,3}(?:,[0-9]{3})+)(?:\.[0-9]*)?(\s*|[A-Za-z_]*)*$/


This will match

45
45.988
45,689
34,569,098,233
567,900.90
-9
-34 banana fries
0.56 points


but it WON'T match

.56


and I need it to do this.

Answer

The fully general method, given regexes /^A$/ and /^B$/, is:

/^(A|B|AB)$/

i.e., for your example:

/^([0-9]+|\.[0-9]+|[0-9]+\.[0-9]+)$/
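As a rough sketch of the general method, the pattern can be assembled from the two sub-pattern sources. The helper name `oneOrBoth` is hypothetical, and it uses non-capturing groups rather than the capturing group shown above, purely so that alternations inside A or B stay contained. It assumes neither A nor B can match the empty string on its own.

```javascript
// Hypothetical helper: builds /^(?:A|B|AB)$/ from two regex source strings.
function oneOrBoth(a, b) {
  // Wrap each part so that any alternation inside it cannot leak out.
  const A = `(?:${a})`;
  const B = `(?:${b})`;
  return new RegExp(`^(?:${A}|${B}|${A}${B})$`);
}

// A = one or more digits, B = a dot followed by one or more digits.
const number = oneOrBoth("[0-9]+", "\\.[0-9]+");

for (const s of ["234", ".56", "234.56", ""]) {
  console.log(JSON.stringify(s), number.test(s));
}
```

The pattern grows with the size of A and B because each appears twice, which is why factoring it down by hand can be worthwhile for simple cases.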

Note that the other answers used the structure of your example to simplify it. Specifically, they (implicitly) factorised the alternation, pulling out the common [0-9]* factor on the left and the common [0-9]+ factor on the right.

The working for this is:

  • All the elements of the alternation end in [0-9]+, so pull that out: /^(|\.|[0-9]+\.)[0-9]+$/
  • Now the alternation contains the empty string, so rewrite it using ? (i.e. use the equivalence (|a|b) = (a|b)?): /^(\.|[0-9]+\.)?[0-9]+$/
  • Again we have an alternation with a common suffix (\. this time), so pull that out: /^((|[0-9]+)\.)?[0-9]+$/
  • The pattern (|a+) is the same as a*, so, finally: /^([0-9]*\.)?[0-9]+$/
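One way to sanity-check the working above is to run every intermediate pattern over all short strings from a small alphabet and confirm they never disagree. This is only a brute-force sketch, not a proof; the helper `allStrings` and the choice of alphabet (`"1.a"`) and maximum length are assumptions made for illustration.

```javascript
// The starting pattern followed by each step of the factorisation.
const steps = [
  /^([0-9]+|\.[0-9]+|[0-9]+\.[0-9]+)$/,
  /^(|\.|[0-9]+\.)[0-9]+$/,
  /^(\.|[0-9]+\.)?[0-9]+$/,
  /^((|[0-9]+)\.)?[0-9]+$/,
  /^([0-9]*\.)?[0-9]+$/,
];

// Enumerate every string over `alphabet` with length 0..maxLen.
function allStrings(alphabet, maxLen) {
  let level = [""];
  const out = [""];
  for (let n = 1; n <= maxLen; n++) {
    level = level.flatMap((s) => [...alphabet].map((c) => s + c));
    out.push(...level);
  }
  return out;
}

// A digit, a dot, and a letter that should never be accepted.
const inputs = allStrings("1.a", 6);

// Collect any input on which the patterns give different verdicts.
const disagreements = inputs.filter((s) => {
  const verdicts = steps.map((re) => re.test(s));
  return verdicts.some((v) => v !== verdicts[0]);
});

console.log("inputs checked:", inputs.length);
console.log("disagreements:", disagreements);
```

An empty `disagreements` array means all five patterns accept exactly the same strings in the tested range.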