Cornwell Cornwell - 9 days ago 6
C# Question

Regex capture order: wrong alternative matched after greedy pattern

I have this pattern:

(\w+)(sin|in|pak|red)$


And the replacement pattern is this:

$1tak


The problem is that this word:


setesin


will be transformed to:


setestak


instead of


setetak


For some reason, "in" always takes precedence over "sin" in the pattern.

How can I force the pattern to try the alternatives in that order?

Answer

Use a lazy quantifier:

(\w+?)(sin|in|pak|red)$
    ^


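The \w+ in the original pattern is greedy: it first grabs as many characters as it can (note that \w matches s, i, n, and every other letter, digit, and underscore), so it initially consumes the whole word. It then backtracks, giving up one character at a time from right to left, until the rest of the pattern can match. With \w+ holding "setes", the alternation is tried at "in": sin fails there, but in matches, and $ then matches the end of the string. The whole pattern has now matched, so the engine stops and never gives back one more character to try sin. The order of the alternatives is not the issue; the position at which the alternation is tried is.

A lazy quantifier works the other way around: \w+? matches one word character, lets the rest of the pattern try, and expands by one character at a time, moving from left to right, only when the rest fails. The alternation is therefore tried at the earliest possible position first. For setesin, the first position where it succeeds is at "sin", which leaves "sete" in group 1.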
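The difference between the two quantifiers can be checked with a small console program. This is only a minimal sketch: the class name SuffixDemo and the helper ReplaceSuffix are made up for illustration, while the two patterns, the replacement string, and the test word come from the question and answer above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SuffixDemo
{
    // Pattern from the question: greedy \w+ leaves only "in" for group 2.
    public const string GreedyPattern = @"(\w+)(sin|in|pak|red)$";

    // Pattern from the answer: lazy \w+? lets group 2 match as early as possible.
    public const string LazyPattern = @"(\w+?)(sin|in|pak|red)$";

    // Hypothetical helper: swaps the matched suffix for "tak",
    // keeping whatever group 1 captured.
    public static string ReplaceSuffix(string word, string pattern)
    {
        return Regex.Replace(word, pattern, "$1tak");
    }

    public static void Main()
    {
        Console.WriteLine(ReplaceSuffix("setesin", GreedyPattern)); // setestak
        Console.WriteLine(ReplaceSuffix("setesin", LazyPattern));   // setetak
    }
}
```

With the greedy pattern, group 1 captures "setes"; with the lazy one it captures "sete", which is why only the second call strips the full "sin" suffix.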