user3639557 - 5 months ago
Java Question

regex fails to catch all matches

Here is an example:

The two (Senior Officer Stuart & Officer Jess) were intercepted by Officer George.


Now, let's say I have two ranks, "Officer" and "Senior Officer", and want to replace the name after each of them with a general token "PERSON". As you can see, three names come after a rank: Stuart, Jess, and George. I don't know why my regex solution fails to capture all of them. Here is my code:

public static void main(String[] args) {
    String input = "The two (Senior Officer Stuart & Officer Jess) were intercepted by Officer George.";
    ArrayList<String> ranks = new ArrayList<String>();
    ranks.add("Senior Officer");
    ranks.add("Officer");
    for (String rank : ranks) {
        Pattern pattern = Pattern.compile(".*" + rank + " ([a-zA-Z]*?) .*");
        Matcher m = pattern.matcher(input);
        if (m.find()) {
            System.out.println(rank);
            System.out.println(m.group(1));
        }
    }
}


and here is its output:

Senior Officer
Stuart
Officer
Stuart


which captures Stuart twice (via Senior Officer and Officer), but ignores Jess and George. I am expecting to get this as the output:

Senior Officer
Stuart
Officer
Stuart
Officer
Jess
Officer
George

Answer

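With .* on both sides of the rank, a single match consumes the whole sentence, so nothing is left for a second match. The space required after the name also rules out Jess (followed by ")") and George (followed by "."). Drop the leading and trailing .* so that each match covers only the rank and the name after it, and call m.find() in a while loop instead of an if so that every occurrence is reported. This will be sufficient: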

for (String rank : ranks) {
    Pattern pattern = Pattern.compile("(?:" + rank + ")" + "\\s+([a-zA-Z]*)");
    Matcher m = pattern.matcher(input);
    while (m.find()) {
        System.out.println(rank);
        System.out.println(m.group(1));
    }
}
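
For anyone who wants to run this directly, here is a minimal self-contained sketch of the same loop. The class name RankNames and the method findNames are made up for illustration. It also makes two small changes that are my own assumptions rather than part of the answer above: Pattern.quote so the rank is treated as literal text, and [a-zA-Z]+ instead of [a-zA-Z]* so an empty name never counts as a match.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical wrapper class, not part of the original answer.
final class RankNames {

    // Collects "rank: name" for every occurrence of each rank, one rank at a time.
    static List<String> findNames(String input, List<String> ranks) {
        List<String> found = new ArrayList<>();
        for (String rank : ranks) {
            // Assumption: quote the rank so any regex metacharacters in it are literal,
            // and require at least one letter in the name.
            Pattern pattern = Pattern.compile(Pattern.quote(rank) + "\\s+([a-zA-Z]+)");
            Matcher m = pattern.matcher(input);
            // while (not if): keep searching from the end of the previous match.
            while (m.find()) {
                found.add(rank + ": " + m.group(1));
            }
        }
        return found;
    }

    public static void main(String[] args) {
        String input = "The two (Senior Officer Stuart & Officer Jess) were intercepted by Officer George.";
        for (String line : findNames(input, Arrays.asList("Senior Officer", "Officer"))) {
            System.out.println(line);
        }
    }
}
```

Note that "Officer" also matches inside "Senior Officer", which is why Stuart is reported under both ranks, as in the expected output from the question.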

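
Since the stated goal in the question is to replace each name with the token "PERSON", the following sketch may be closer to what is eventually needed. It is only one possible approach, not something the answer above proposes: it puts both ranks into a single alternation and uses replaceAll with a back-reference to keep the rank. The names PersonMasker and mask are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical helper class for illustration only.
final class PersonMasker {

    // Both ranks in one alternation; the longer rank is listed first as a precaution.
    // \b keeps the rank from matching in the middle of a longer word.
    private static final Pattern RANK_AND_NAME =
            Pattern.compile("\\b(Senior Officer|Officer)\\s+[a-zA-Z]+");

    static String mask(String input) {
        // $1 re-inserts the captured rank; the name after it is replaced by PERSON.
        return RANK_AND_NAME.matcher(input).replaceAll("$1 PERSON");
    }

    public static void main(String[] args) {
        String input = "The two (Senior Officer Stuart & Officer Jess) were intercepted by Officer George.";
        System.out.println(mask(input));
        // prints: The two (Senior Officer PERSON & Officer PERSON) were intercepted by Officer PERSON.
    }
}
```

Because the whole rank-plus-name span is replaced in one pass, "Senior Officer Stuart" is handled once and not a second time via the shorter rank "Officer". Any run of whitespace between rank and name is normalized to a single space in the result.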