Arturas M - 6 months ago
Java Question

Reluctant quantifier acting greedy

I have this code:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

// text holds the input string shown below
String result = text;

String regex = "((\\(|\\[)(.+)(\\)|\\])){1}?";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(result);

System.out.println("start");
System.out.println(result);
// Print the position and text of every match
while (matcher.find()) {
    System.out.print("Start index: " + matcher.start());
    System.out.print(" End index: " + matcher.end() + " ");
    System.out.println(matcher.group());
}
System.out.println("finish");


And I have a string that I want to match:

Some text sentence or sentences [something 234] (some things)


And the output I get when executing:

start
Some text sentence or sentences [something 234] (some things)
Start index: 32 End index: 61 [something 234] (some things)
finish


What I actually want is for each bracketed part to be found as a separate match:
[something 234] in one match
(some things) as the second match

Can anyone please help me build the regex accordingly? I was not sure how to apply the reluctant quantifier to the whole regular expression, so I wrapped the entire bracketed expression in another group. But I don't understand why this reluctant quantifier is acting greedy here. What do I need to change?

Answer

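{1} is redundant in a regex, since any element without an explicit quantifier is matched exactly once anyway. Making it reluctant has no effect either: reluctance only matters when a quantifier allows a range of repetitions (like {min,max}, where adding ? tells the regex engine to keep the number of repetitions as close to min as possible). {n} specifies an exact number of repetitions, so min = max = n and there is nothing to minimize. The greedy part of your pattern is the .+ between the brackets, which extends as far as it can and therefore runs up to the last closing bracket in the input.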
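
To illustrate the difference between a range quantifier and an exact one, here is a minimal sketch. The class name QuantifierDemo, the helper firstMatch, and the input "aaaa" are made up for this illustration and are not part of the question's code.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuantifierDemo {
    // Hypothetical helper: returns the first match of regex in input, or null if none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String input = "aaaa";
        // Greedy range: takes as many repetitions as the range allows.
        System.out.println(firstMatch("a{1,3}", input));   // aaa
        // Reluctant range: takes as few repetitions as the range allows.
        System.out.println(firstMatch("a{1,3}?", input));  // a
        // Exact count: there is only one possible number of repetitions...
        System.out.println(firstMatch("a{2}", input));     // aa
        // ...so appending ? changes nothing.
        System.out.println(firstMatch("a{2}?", input));    // aa
    }
}
```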

You should be able to solve your problem by making .+ (the content between the brackets) reluctant, so that it stops at the first closing bracket it can. To do so, use .+? instead.

So try with:

String regex = "((\\(|\\[)(.+?)(\\)|\\]))";
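
As a rough check, the corrected pattern can be run against the sample sentence from the question. This is only a sketch: the class name BracketMatcher and the method findBracketed are invented for the example, and the matches are collected into a list rather than printed with their indices.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BracketMatcher {
    // The regex from the answer: an opening ( or [, reluctant content, a closing ) or ].
    static final Pattern BRACKETED = Pattern.compile("((\\(|\\[)(.+?)(\\)|\\]))");

    // Hypothetical helper: collects every bracketed match found in text, in order.
    static List<String> findBracketed(String text) {
        List<String> matches = new ArrayList<>();
        Matcher matcher = BRACKETED.matcher(text);
        while (matcher.find()) {
            matches.add(matcher.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        String text = "Some text sentence or sentences [something 234] (some things)";
        // Prints "[something 234]" and "(some things)" on separate lines.
        for (String match : findBracketed(text)) {
            System.out.println(match);
        }
    }
}
```

Note that this pattern does not check that the opening and closing brackets are of the same kind, so a string like (abc] would also match.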