Barry D. - 3 months ago
C# Question

Replacing Regex String in C#

I have a RegexRule class (Regexrule.cs) with the following properties:

public string Expression { get; set; }
public string FirstOpen { get; set; }
public string FirstClose { get; set; }
public string SecondOpen { get; set; }
public string SecondClose { get; set; }


Expression holds a regular expression, which is always expected to return 2 groups.

The other four fields (everything except Expression) are prefixes and suffixes for the two groups that are expected to be found, so that this happens:

FirstOpen + Group[1] + FirstClose
and
SecondOpen + Group[2] + SecondClose


Anyway, I have a List<RegexRule> Rules; that contains the RegexRule objects.


The Predicament


My goal is to loop through each one of those (RegexRule r), run its respective expression (r.Expression) on a particularly long string, and, when the two expected groups are found, wrap each group in its prefix and suffix in the way shown above. Again:

r.FirstOpen + Group[1] + r.FirstClose
and
r.SecondOpen + Group[2] + r.SecondClose


I've tried many different ways, but one thing I know is that str.Replace isn't going to work in a loop, because it will apply the prefixes and suffixes over and over, once for every occurrence of the expression's results.
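To make the compounding concrete, here is a minimal sketch of the problem. The names CompoundingDemo and NaiveWrap are made up for illustration, and the pattern is a simplified version of the kind of expression described above (with the dot escaped).

```csharp
using System;
using System.Text.RegularExpressions;

public static class CompoundingDemo
{
    // Hypothetical helper: wraps group 1 of every match using string.Replace,
    // the same way the loop described above does.
    public static string NaiveWrap(string input, string pattern, string open, string close)
    {
        string result = input;
        foreach (Match m in Regex.Matches(input, pattern))
        {
            string g1 = m.Groups[1].Value;
            // string.Replace rewrites EVERY occurrence of g1 in the string,
            // not only the occurrence that belongs to this match.
            result = result.Replace(g1, open + g1 + close);
        }
        return result;
    }

    public static void Main()
    {
        // "foo" is group 1 of both matches, so it gets wrapped twice.
        Console.WriteLine(NaiveWrap("foo.bar(); foo.baz();", @"(\w+)\.(\w+)\(\);", "[", "]"));
        // prints: [[foo]].bar(); [[foo]].baz();
    }
}
```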

So how else can this be achieved?

Thank you.


Edit


This is what I've currently got:

foreach (RegexRule r in RegexRules.ToList())
{
    Regex rx = new Regex(r.Expression);
    MatchCollection mc = rx.Matches(str);
    foreach (Match m in mc)
    {
        MessageBox.Show("replacing");
        str = str.Replace(m.Groups[1].Value, r.OpenBBOne + m.Groups[1].Value + r.CloseBBOne);
    }
}



Edit 2 - Specifics


Users will create their own regex configurations in a .config file, in this format:

reg {(\w+).(\w+)\(\);} = [("prefix1","suffix1"),("prefix2","suffix2")];



reg
- Keyword that starts the definition of a new RegexRule


{(\w+).(\w+)\(\);}
- Their regular expression, in braces (CONDITION: the expression must always return 2 groups in its matches)


[("prefix1","suffix1"),("prefix2","suffix2")]
- Two pairs in the form [("",""),("","")], which represent the prefixes and suffixes for the two groups
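A line in this format could be turned into a RegexRule roughly as follows. This is only a sketch under assumptions: RuleParser and Parse are hypothetical names, each rule is assumed to sit on a single line, and prefixes and suffixes are assumed not to contain double quotes.

```csharp
using System;
using System.Text.RegularExpressions;

public class RegexRule
{
    public string Expression { get; set; }
    public string FirstOpen { get; set; }
    public string FirstClose { get; set; }
    public string SecondOpen { get; set; }
    public string SecondClose { get; set; }
}

public static class RuleParser
{
    // reg {<expression>} = [("open1","close1"),("open2","close2")];
    // The greedy (?<expr>.*) runs up to the last '}' that is followed by '= [',
    // so braces inside the user's expression are tolerated.
    private static readonly Regex LinePattern = new Regex(
        @"^\s*reg\s*\{(?<expr>.*)\}\s*=\s*\[\s*" +
        @"\(\s*""(?<o1>[^""]*)""\s*,\s*""(?<c1>[^""]*)""\s*\)\s*,\s*" +
        @"\(\s*""(?<o2>[^""]*)""\s*,\s*""(?<c2>[^""]*)""\s*\)\s*\]\s*;\s*$");

    public static RegexRule Parse(string line)
    {
        Match m = LinePattern.Match(line);
        if (!m.Success)
            throw new FormatException("Not a valid reg line: " + line);

        return new RegexRule
        {
            Expression = m.Groups["expr"].Value,
            FirstOpen = m.Groups["o1"].Value,
            FirstClose = m.Groups["c1"].Value,
            SecondOpen = m.Groups["o2"].Value,
            SecondClose = m.Groups["c2"].Value
        };
    }

    public static void Main()
    {
        RegexRule rule = Parse(
            @"reg {(\w+).(\w+)\(\);} = [(""prefix1"",""suffix1""),(""prefix2"",""suffix2"")];");
        Console.WriteLine(rule.Expression);                      // (\w+).(\w+)\(\);
        Console.WriteLine(rule.FirstOpen + " " + rule.SecondClose); // prefix1 suffix2
    }
}
```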

Example

If we applied the above configuration to this string:

Lorem ipsum foo.bar(); dolor sit bar.foo(); amit consecteteur...


The regex would capture foo.bar(); as a match, in which foo is match[1] group[1] and bar is match[1] group[2], according to the regular expression.

The same goes for bar.foo(); where bar is match[2] group[1] and foo is match[2] group[2].

I hope this makes sense...
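For reference, the matches and groups described above can be listed with a few lines of code. GroupListing and Describe are hypothetical names used only for this illustration; the pattern is the one from the example configuration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class GroupListing
{
    // Returns one line per match, naming its two captured groups.
    public static List<string> Describe(string input, string pattern)
    {
        var lines = new List<string>();
        int n = 0;
        foreach (Match m in Regex.Matches(input, pattern))
        {
            n++;
            lines.Add("match " + n + ": group 1 = " + m.Groups[1].Value +
                      ", group 2 = " + m.Groups[2].Value);
        }
        return lines;
    }

    public static void Main()
    {
        string text = "Lorem ipsum foo.bar(); dolor sit bar.foo(); amit consecteteur...";
        foreach (string line in Describe(text, @"(\w+).(\w+)\(\);"))
            Console.WriteLine(line);
        // match 1: group 1 = foo, group 2 = bar
        // match 2: group 1 = bar, group 2 = foo
    }
}
```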

Answer

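As per our discussion, I think this might be a solution for you. It has to do with the first comment I made. It reduces your MatchCollection to unique values so that you don't end up compounding the prefixes and suffixes. Note that Match does not override Equals, so a bare .Distinct() on Match objects compares references and removes nothing; grouping by m.Value is what actually removes the duplicates here.

// Requires: using System.Linq;
foreach(RegexRule r in RegexRules.ToList())
{
    Regex rx = new Regex(r.Expression);
    MatchCollection mc = rx.Matches(str);
    // Keep only the first Match for each distinct matched text.
    foreach(Match m in mc.OfType<Match>().GroupBy(x => x.Value).Select(g => g.First()))
    {
         MessageBox.Show("replacing");
         str = str.Replace(m.Groups[1].Value,
                           r.OpenBBOne + m.Groups[1].Value + r.CloseBBOne);
    }
}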

If you can't use LINQ for some reason, you can do the same thing yourself by creating a new List<Match> and adding only the matches whose text isn't in the list yet.

foreach(RegexRule r in RegexRules.ToList())
{ 
    Regex rx = new Regex(r.Expression); 
    MatchCollection mc = rx.Matches(str);

    List<Match> matches = new List<Match>();
    List<string> strings = new List<string>();
    foreach(Match m in mc)
        if(!strings.Contains(m.Value))
        {
            matches.Add(m);
            strings.Add(m.Value);
        }

    foreach(Match m in matches) 
    { 
         MessageBox.Show("replacing");
         str = str.Replace(m.Groups[1].Value, 
                           r.OpenBBOne + m.Groups[1].Value + r.CloseBBOne);
    }
}
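Another option, not part of the code above, is to skip str.Replace entirely and let Regex.Replace rebuild each match through a MatchEvaluator. Each occurrence is then rewritten exactly once, at its own position, so nothing compounds even when the same text is captured by several matches. The sketch below is a rough illustration under assumptions: RuleApplier and Apply are hypothetical names, the property names are the ones from the question's class listing, and the two groups are assumed to appear in order without overlapping.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

public class RegexRule
{
    public string Expression { get; set; }
    public string FirstOpen { get; set; }
    public string FirstClose { get; set; }
    public string SecondOpen { get; set; }
    public string SecondClose { get; set; }
}

public static class RuleApplier
{
    // Rewrites every match of rule.Expression in one pass, wrapping groups 1 and 2.
    public static string Apply(string input, RegexRule rule)
    {
        return Regex.Replace(input, rule.Expression, m =>
        {
            Group g1 = m.Groups[1];
            Group g2 = m.Groups[2];
            int matchEnd = m.Index + m.Length;
            int afterG1 = g1.Index + g1.Length;
            int afterG2 = g2.Index + g2.Length;

            // Leave the match untouched if the groups are missing or not laid out
            // as group 1 followed by group 2 inside the match (an assumption).
            if (!g1.Success || !g2.Success || g1.Index < m.Index ||
                g2.Index < afterG1 || afterG2 > matchEnd)
                return m.Value;

            var sb = new StringBuilder();
            sb.Append(input, m.Index, g1.Index - m.Index);   // text before group 1
            sb.Append(rule.FirstOpen).Append(g1.Value).Append(rule.FirstClose);
            sb.Append(input, afterG1, g2.Index - afterG1);   // text between the groups
            sb.Append(rule.SecondOpen).Append(g2.Value).Append(rule.SecondClose);
            sb.Append(input, afterG2, matchEnd - afterG2);   // text after group 2
            return sb.ToString();
        });
    }

    public static void Main()
    {
        var rule = new RegexRule
        {
            Expression = @"(\w+).(\w+)\(\);",
            FirstOpen = "[", FirstClose = "]",
            SecondOpen = "<", SecondClose = ">"
        };
        Console.WriteLine(Apply("Lorem ipsum foo.bar(); dolor sit bar.foo(); amit", rule));
        // prints: Lorem ipsum [foo].<bar>(); dolor sit [bar].<foo>(); amit
    }
}
```

With a list of rules you would call Apply once per rule, feeding each result into the next call. Keep in mind that later rules then see the prefixes and suffixes inserted by earlier ones.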