Emran Sadeghi - 4 months ago
C# Question

Regex get group block with specific start and end each group

Suppose we have a string like this:

----------DBVer=1
/*some sql script*/
----------DBVer=1
----------DBVer=2
/*some sql script*/
----------DBVer=2
----------DBVer=n
/*some sql script*/
----------DBVer=n


Can we use regex to extract the script between the first DBVer=1 and the second DBVer=1, and likewise for every other version?

I think we need some kind of placeholder in the regex that tells the engine: if you see DBVer=digitA, capture everything until DBVer=digitA appears again; if you see DBVer=digitB, capture everything until DBVer=digitB appears again; and so on.

Can this be implemented with regex, and if so, how?

Answer

Yes, using backreferences, you can capture the scripts:

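// Requires: using System.Linq; using System.Text.RegularExpressions;
// The lazy .*? stops at the nearest closing marker, and (?!\d) keeps
// DBVer=1 from also matching the start of DBVer=10.
var pattern = @"(?<=-{10}DBVer=(?<id>\d+)\r?\n).*?(?=\r?\n-{10}DBVer=\k<id>(?!\d))";
var scripts = Regex.Matches(input, pattern, RegexOptions.Singleline)
                .Cast<Match>()
                .Select(m => m.Value);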

Here, we capture the id group with (?<id>\d+) and reuse the id value later in the regex with \k<id>.
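If you also need to know which version each script belongs to, the same backreference idea works with a consuming pattern instead of lookarounds. The following is only a sketch: the class name DbScriptExtractor, the method Extract, and the group name script are hypothetical names chosen for illustration, and it assumes version numbers fit in an int.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class DbScriptExtractor
{
    // The markers are consumed rather than looked around, so the version
    // number and the script body are both available as named groups.
    // \k<id> must repeat exactly what (?<id>\d+) captured, and (?!\d)
    // stops DBVer=1 from matching the start of DBVer=10.
    private static readonly Regex BlockPattern = new Regex(
        @"-{10}DBVer=(?<id>\d+)\r?\n(?<script>.*?)\r?\n-{10}DBVer=\k<id>(?!\d)",
        RegexOptions.Singleline);

    // Returns (version, script) pairs in the order they appear in the input.
    public static List<KeyValuePair<int, string>> Extract(string input)
    {
        var result = new List<KeyValuePair<int, string>>();
        foreach (Match m in BlockPattern.Matches(input))
        {
            result.Add(new KeyValuePair<int, string>(
                int.Parse(m.Groups["id"].Value),   // assumption: fits in an int
                m.Groups["script"].Value));
        }
        return result;
    }
}
```

Because each match consumes its closing marker, that marker cannot be mistaken for the opening marker of another block.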

For . to match newline characters, Singleline mode must be turned on. This, in turn, means we have to be explicit about the newlines around the markers. Writing them as \r?\n covers both Windows (\r\n) and Unix (\n) line endings.
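To check the newline handling yourself, a small sketch like the one below can help. It runs a lookaround pattern of the same shape (with a lazy .*? and a (?!\d) guard) under whatever options you pass in; the class name NewlineHandlingDemo and the method Extract are hypothetical names used only for this example.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class NewlineHandlingDemo
{
    // Lookaround pattern: only the script body ends up in the match value.
    // \r?\n accepts both LF and CRLF line endings next to the markers.
    private const string Pattern =
        @"(?<=-{10}DBVer=(?<id>\d+)\r?\n).*?(?=\r?\n-{10}DBVer=\k<id>(?!\d))";

    // Runs the pattern with the given options and returns the matched scripts.
    public static string[] Extract(string input, RegexOptions options)
    {
        return Regex.Matches(input, Pattern, options)
                    .Cast<Match>()
                    .Select(m => m.Value)
                    .ToArray();
    }
}
```

With a two-line script between the markers, RegexOptions.Singleline returns the script for both LF and CRLF input, while RegexOptions.None returns nothing, because . cannot cross the \n inside the script.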