ee clipse - 3 months ago
Java Question

How to Match a String Against Multiple Regex Patterns in Java

I understand how to match a single String against multiple regex patterns using the pipe symbol as explained in some of the answers to this question: Match a string against multiple regex patterns

My question concerns the following String:

this_isAnExample of What nav-input a-autoid-9-announce thisIsAnExampleToo


And I use the following regex to extract text:

[A-Z][a-z]*|(?<=_)[A-Za-z-]*


I am expecting to get the following matches:

is
An
Example
What
Is
An
Example
Too


But what I actually get is:

isAnExample
What
Is
An
Example
Too


Basically, the engine automatically joins the word An with Example because the whole run matches the underscore pattern, but I want it to treat them as two words (non-greedy?) because, according to the other pattern, there is another match.

Answer

You probably meant the regex to be

[A-Z][a-z]*|(?<=_)[a-z-]*

The first alternative matches a lowercase word that starts with an uppercase letter; the second matches a lowercase word (hyphens allowed) preceded by an underscore.

The part of your posted regex (?<=_)[A-Za-z-]* matches both lowercase and uppercase letters after the underscore, i.e. it does not stop matching when it meets an uppercase letter, which should in fact be the start of another word.
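
To compare the two patterns side by side, here is a minimal sketch that runs both against the String from the question. The class name WordMatchDemo and the helper findAll are made up for illustration; only the standard java.util.regex classes are used. Because both alternatives use *, the lookbehind branch can in principle produce a zero-length match right after an underscore, so the helper skips empty matches (an assumption about what you want, not something the question states).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original question or answer.
class WordMatchDemo {

    // Collects every non-empty match of regex in input, in order of appearance.
    static List<String> findAll(String regex, String input) {
        List<String> matches = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            // The * quantifier permits zero-length matches after '_'; skip them.
            if (!m.group().isEmpty()) {
                matches.add(m.group());
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        String input =
            "this_isAnExample of What nav-input a-autoid-9-announce thisIsAnExampleToo";

        // Posted regex: the lookbehind branch also consumes uppercase letters.
        System.out.println(findAll("[A-Z][a-z]*|(?<=_)[A-Za-z-]*", input));
        // prints [isAnExample, What, Is, An, Example, Too]

        // Suggested regex: the lookbehind branch stops at the first uppercase letter.
        System.out.println(findAll("[A-Z][a-z]*|(?<=_)[a-z-]*", input));
        // prints [is, An, Example, What, Is, An, Example, Too]
    }
}
```

Making the quantifier lazy would not help here: with nothing after it in the pattern, a lazy [A-Za-z-]*? would simply match as little as possible (the empty string). Narrowing the character class is what makes the second branch stop at the word boundary.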