I have a string which contains certain class names + method calls. I'd like to extract these but the regex I wrote to do this returns more matches than intended.
#!/usr/bin/perl
use strict;
use warnings;

my $someInput = "Lorem ipsum dolor sit CLASS3.aMethod.anotherMethod amet, consetetur sadipscing CLASS1.bMethod elitr, sed diam nonumy eirmod";
# A /g match in list context returns every capture group of every match.
my @matches = ($someInput =~ /((CLASS1|CLASS2|CLASS3)(\.[A-Za-z0-9_]+){1,})/g);
foreach my $match (@matches) {
    print $match . "\n";
}
Actual output:
CLASS3.aMethod.anotherMethod
CLASS3
.anotherMethod
CLASS1.bMethod
CLASS1
.bMethod
Expected output:
CLASS3.aMethod.anotherMethod
CLASS1.bMethod
The regular expression
((CLASS1|CLASS2|CLASS3)(\.[A-Za-z0-9_]+){1,})
defines 3 capture groups, each delineated by a balanced pair of parentheses:
1: ((CLASS1|CLASS2|CLASS3)(\.[A-Za-z0-9_]+){1,})
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2: ((CLASS1|CLASS2|CLASS3)(\.[A-Za-z0-9_]+){1,})
     ^^^^^^^^^^^^^^^^^^^^
3: ((CLASS1|CLASS2|CLASS3)(\.[A-Za-z0-9_]+){1,})
                           ^^^^^^^^^^^^^^^
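and so each time the pattern matches, a /g match in list context returns 3 values, one per capture group. Because group 3 is repeated by {1,}, it keeps only its last repetition, which is why .anotherMethod rather than .aMethod appears in the output.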
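To see what each group holds, here is a minimal sketch that walks the matches one at a time and prints $1, $2 and $3. The pattern and input are the ones from the question (input shortened); the names $input and @rows are chosen only for this illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $input = "Lorem ipsum dolor sit CLASS3.aMethod.anotherMethod amet, "
          . "consetetur sadipscing CLASS1.bMethod elitr";

# In scalar context, /g advances through the string one match per iteration.
my @rows;
while ($input =~ /((CLASS1|CLASS2|CLASS3)(\.[A-Za-z0-9_]+){1,})/g) {
    # $1 = whole call chain, $2 = class name,
    # $3 = last repetition of the quantified group
    push @rows, [$1, $2, $3];
    print "1: $1\n2: $2\n3: $3\n";
}
# Prints:
# 1: CLASS3.aMethod.anotherMethod
# 2: CLASS3
# 3: .anotherMethod
# 1: CLASS1.bMethod
# 2: CLASS1
# 3: .bMethod
```

In list context the same six values come back as one flat list, which is exactly the output shown in the question.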
Use the (?:...) syntax to keep part of a regular expression together as a logical unit while designating it as a non-capturing group. With the two inner groups made non-capturing, only the outer group captures, so this expression returns just one value per successful match:
((?:CLASS1|CLASS2|CLASS3)(?:\.[A-Za-z0-9_]+){1,})
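Dropped into the script from the question, the non-capturing version might look like the sketch below. The variable names $input and @same are chosen for this illustration, and the input is shortened. The second match is a variant, not part of the original suggestion: it relies on Perl returning the whole matched text from a list-context /g match when the pattern has no capture groups at all, and it writes {1,} as the equivalent +.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $input = "Lorem ipsum dolor sit CLASS3.aMethod.anotherMethod amet, "
          . "consetetur sadipscing CLASS1.bMethod elitr";

# Only the outer group captures, so list-context /g yields one string per match.
my @matches = $input =~ /((?:CLASS1|CLASS2|CLASS3)(?:\.[A-Za-z0-9_]+){1,})/g;

# Variant with no capture groups: list-context /g then returns each whole match.
my @same = $input =~ /(?:CLASS1|CLASS2|CLASS3)(?:\.[A-Za-z0-9_]+)+/g;

print "$_\n" for @matches;
# Prints:
# CLASS3.aMethod.anotherMethod
# CLASS1.bMethod
```

Both lists contain the same two strings; the explicit outer group is the safer choice if further capture groups are added to the pattern later.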