tbkn23 tbkn23 - 3 months ago 7
C# Question

.Net Regex - last of repeating characters

I'm trying to capture everything inside curly braces, but in some cases there may be multiple braces, and I want the outermost ones.

For example, in "I want to capture {{this}} part" I'll need {{this}} as the capture.

So I went with ({[^}]+}+) to capture the inner text, but of course this will yield multiple captures: {{this} and {{this}}.

So I tried telling the regex to match the phrase only if the next character is not a closing curly brace: ({[^}]+}+)[^}]. This works unless the capture is at the end of the input, in which case it fails because it expects a non-} character at the end.

So I tried adding an end-of-string option, ({[^}]+}+)[$|^}], but for some reason this captures {{this} again. I have no idea why; it should only capture if the next character is the end of input or not a curly brace...

Suggestions?

Edit:

Just to be clear, I'm not searching for valid nested braces, only for text between { and the first matching } (no nesting!). However, there may be cases where instead of one open/close brace there are two (so {something} and {{something}} both need to be caught).

The reason for this is that the original text always has double braces {{ }}, but sometimes the text goes through string.Format before the regex runs, in which case the double braces become single braces.
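To illustrate the point above, here is a minimal sketch of how string.Format collapses doubled braces. The template text and the class name BraceEscapeDemo are made up for the example; only the escaping behaviour of composite format strings is being shown.

```csharp
using System;

public static class BraceEscapeDemo
{
    // Hypothetical helper: runs a template through string.Format with one argument.
    public static string Apply(string template, object arg)
    {
        return string.Format(template, arg);
    }

    public static void Main()
    {
        // In a composite format string, "{{" and "}}" are escapes for literal braces.
        string template = "Hello {{name}}, you have {0} messages";

        Console.WriteLine(template);           // Hello {{name}}, you have {0} messages
        Console.WriteLine(Apply(template, 3)); // Hello {name}, you have 3 messages
    }
}
```

So the same placeholder can reach the regex either as {{name}} or as {name}, depending on whether the text was formatted first.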

Answer

You seem to have used a character class at the end instead of a non-capturing group. Try:

({[^}]+}+)(?:$|[^}])

This is a very small modification to your final attempt that just uses the correct syntax. Your final attempt ends with [$|^}]. The issue is that you can't have an alternation | inside a character class []. Most special characters lose their special meaning inside a character class and are treated as literals, with a few exceptions, one of which is ^ when it is the first character. So [$|^}] matches any one of the four literal characters $, |, ^, or }. I changed the syntax to what you intended by using a non-capturing group (?:stuff); this group does not save its contents and is purely for grouping. As such, (?:$|[^}]) means end of input or a non-} character, as you wanted.
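The difference between the two patterns can be checked with a short program. This is only a sketch: the class name BraceCaptureDemo, the helper FirstCapture, and the sample inputs are assumptions made for illustration, while the two patterns are taken verbatim from the question and the answer.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BraceCaptureDemo
{
    // The asker's last attempt: [$|^}] is a character class of four literal characters.
    public const string CharClassPattern = @"({[^}]+}+)[$|^}]";

    // The corrected form: a non-capturing group containing a real alternation.
    public const string GroupPattern = @"({[^}]+}+)(?:$|[^}])";

    // Hypothetical helper: returns capture group 1, or an empty string if there is no match.
    public static string FirstCapture(string pattern, string input)
    {
        Match m = Regex.Match(input, pattern);
        return m.Success ? m.Groups[1].Value : string.Empty;
    }

    public static void Main()
    {
        string atEnd = "I want to capture {{this}}";
        string inMiddle = "I want to capture {{this}} part";

        // The class consumes the last '}' as a literal, so the group loses it.
        Console.WriteLine(FirstCapture(CharClassPattern, atEnd));      // {{this}

        // With the alternation, '$' matches at the end of the input.
        Console.WriteLine(FirstCapture(GroupPattern, atEnd));          // {{this}}
        Console.WriteLine(FirstCapture(GroupPattern, inMiddle));       // {{this}}
        Console.WriteLine(FirstCapture(GroupPattern, "a {single} b")); // {single}
    }
}
```

Note that the non-capturing group still consumes the character after the closing brace when there is one, so the sketch reads Groups[1] rather than the whole match.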

Note that this makes no effort to balance the curly braces (match the number of braces at the beginning and end).
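As a rough illustration of that caveat, the sketch below runs the same pattern over an input whose brace counts do not match. The class name UnbalancedBraceDemo, the helper Captures, and the sample input are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class UnbalancedBraceDemo
{
    // Hypothetical helper: collects capture group 1 from every match of the answer's pattern.
    public static List<string> Captures(string input)
    {
        var result = new List<string>();
        foreach (Match m in Regex.Matches(input, @"({[^}]+}+)(?:$|[^}])"))
        {
            result.Add(m.Groups[1].Value);
        }
        return result;
    }

    public static void Main()
    {
        // The opening and closing brace counts differ, yet both pieces are captured.
        foreach (string s in Captures("{{this} and {that}}"))
        {
            Console.WriteLine(s);
        }
        // prints:
        // {{this}
        // {that}}
    }
}
```

If balanced pairs matter for your input, the captured text would need a separate check, for example comparing the number of leading { with the number of trailing }.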
