Angelo - 1 month ago
C# Question

How to get part of a string with a regular expression in C#

How do I get the 'Name' and 'Age' values?

Case1 Data:

aaa bbbb; Name=John Lewis; ccc ddd; Age=20;


Case2 Data:

AAA bbbb; Age=21;


My regular expression:

(?:Name=(?'name'[\w\b]+)\;)[\s\S]*Age=(?'age'\d+)\;?


But it does not return the Name and Age values.

Answer

Case 1: Only Name is optional

A regex for your case should account for an optional Name field.

(?:\bName=(?<Name>[^;]+).*?;\s+)?\bAge=(?<Age>\d+)
^^^                            ^^

See the regex demo

If the Name and Age data are on separate lines, pass the RegexOptions.Singleline flag so that .*? can also match across line breaks.
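
To illustrate the effect of the flag, here is a minimal sketch. The multi-line input is a hypothetical sample (not from the question) in which an unrelated line sits between Name and Age, so the .*? part has to cross line breaks to connect the two fields.

```csharp
using System;
using System.Text.RegularExpressions;

var pattern = @"(?:\bName=(?<Name>[^;]+).*?;\s+)?\bAge=(?<Age>\d+)";

// Hypothetical input: an unrelated line separates Name from Age.
var input = "Name=John Lewis;\nccc ddd;\nAge=20;";

// Without Singleline, . stops at newlines, so the optional Name part is skipped.
var plain = Regex.Match(input, pattern);
// With Singleline, . matches newlines too, so Name and Age end up in one match.
var dotAll = Regex.Match(input, pattern, RegexOptions.Singleline);

Console.WriteLine("Default:    Name=\"{0}\", Age=\"{1}\"", plain.Groups["Name"].Value, plain.Groups["Age"].Value);
Console.WriteLine("Singleline: Name=\"{0}\", Age=\"{1}\"", dotAll.Groups["Name"].Value, dotAll.Groups["Age"].Value);
// => Default:    Name="", Age="20"
//    Singleline: Name="John Lewis", Age="20"
```

When Age starts on the line directly after Name, the flag is not needed, because \s+ already matches the newline.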

Details:

  • (?:\bName=(?<Name>[^;]+).*?;\s+)? - an optional non-capturing group that matches the following sequence:
    • \bName= - the whole word "Name" followed by =
    • (?<Name>[^;]+) - Group "Name" capturing 1+ chars other than ;
    • .*? - any 0+ chars, as few as possible (other than newline unless RegexOptions.Singleline or (?s) is used)
    • ; - a semicolon
    • \s+ - 1 or more whitespace characters
  • \bAge= - the whole word "Age" followed by =
  • (?<Age>\d+) - Group "Age" capturing 1+ digits.

C# demo:

var strs = new[] { "aaa bbbb; Name=John Lewis; ccc ddd; Age=20;", "AAA bbbb; Age=21;" };
var pattern = @"(?:\bName=(?<Name>[^;]+).*?;\s+)?\bAge=(?<Age>\d+)";
foreach (var str in strs) 
{
    var result = Regex.Match(str, pattern);
    if (result.Success) 
        Console.WriteLine("Name: \"{0}\", Age: \"{1}\"", result.Groups["Name"].Value, result.Groups["Age"].Value);
}
// => Name: "John Lewis", Age: "20"
//    Name: "", Age: "21"
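
The demo above prints an empty string when Name is absent. If the caller needs to tell a missing Name apart from an empty value, Group.Success reports whether the optional group took part in the match. A minimal sketch, assuming a hypothetical helper named Describe and a "(none)" placeholder, neither of which is part of the original answer:

```csharp
using System;
using System.Text.RegularExpressions;

var pattern = @"(?:\bName=(?<Name>[^;]+).*?;\s+)?\bAge=(?<Age>\d+)";

// Describe is a hypothetical helper used only for this illustration.
string Describe(string input)
{
    var m = Regex.Match(input, pattern);
    if (!m.Success)
        return "no match";

    // Group.Success is false when the optional Name group did not participate.
    var name = m.Groups["Name"].Success ? m.Groups["Name"].Value : "(none)";
    var age = m.Groups["Age"].Value;
    return "Name: " + name + ", Age: " + age;
}

Console.WriteLine(Describe("aaa bbbb; Name=John Lewis; ccc ddd; Age=20;"));
Console.WriteLine(Describe("AAA bbbb; Age=21;"));
// => Name: John Lewis, Age: 20
//    Name: (none), Age: 21
```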

Case 2: Both Name and Age are optional

Use optional groups for both fields:

(?:\bName=(?<Name>[^;]+).*?;\s+)?(?:\bAge=(?<Age>\d+))?
^^^                            ^^^^^                 ^^

See this C# demo

var strs = new[] { "aaa bbbb; Name=John Lewis; ccc ddd; Age=20;", "AAA bbbb; Age=21;", "Irrelevant", "My Name=Wiktor; no more data" };
var pattern = @"(?:\bName=(?<Name>[^;]+).*?;\s+)?(?:\bAge=(?<Age>\d+))?";
foreach (var str in strs) 
{
    var results = Regex.Matches(str, pattern)
        .Cast<Match>()
        .Where(m => m.Groups["Name"].Success || m.Groups["Age"].Success)
        .Select(p => new {key=p.Groups["Name"].Value, val=p.Groups["Age"].Value} )
        .ToList();
    foreach (var r in results)
        Console.WriteLine("Name: \"{0}\", Age: \"{1}\"", r.key, r.val);
}
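
The Where filter in the demo above is there because a pattern built only from optional groups can also match the empty string. A minimal sketch of that effect, using the "Irrelevant" sample from the demo (the variable names all and useful are illustrative only):

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

var pattern = @"(?:\bName=(?<Name>[^;]+).*?;\s+)?(?:\bAge=(?<Age>\d+))?";
var input = "Irrelevant";

// Every match here is an empty match, since neither optional group can succeed.
var all = Regex.Matches(input, pattern).Cast<Match>().ToList();

// Keep only the matches in which at least one named group participated.
var useful = all
    .Where(m => m.Groups["Name"].Success || m.Groups["Age"].Success)
    .ToList();

Console.WriteLine("Unfiltered matches: {0}, all empty: {1}", all.Count, all.All(m => m.Length == 0));
Console.WriteLine("Matches with a Name or an Age: {0}", useful.Count);
```

The unfiltered list is not empty even though the input contains neither field, which is why the demo checks Groups["Name"].Success and Groups["Age"].Success before using a match.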

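Alternatively, for a more regex-engine-friendly pattern, use an alternation with 2 branches, each of which makes one of the two fields obligatory. This avoids having to filter out empty matches. It also keeps a Name and a later Age in a single match: with the all-optional pattern, the first sample string yields one match holding only the Name and a second match holding only the Age, because the optional Age group is allowed to match nothing right after the Name part.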

var strs = new[] { "aaa bbbb; Name=John Lewis; ccc ddd; Age=20;", "AAA bbbb; Age=21;", "Irrelevant", "My Name=Wiktor; no more data" };
var pattern = @"(?:\bName=(?<Name>[^;]+).*?;\s+)?\bAge=(?<Age>\d+)|\bName=(?<Name>[^;]+)(?:.*?;\s+\bAge=(?<Age>\d+))?";
foreach (var str in strs) 
{
    var result = Regex.Match(str, pattern);
    if (result.Success)
    {
        Console.WriteLine("Name: \"{0}\", Age: \"{1}\"", result.Groups["Name"].Value, result.Groups["Age"].Value);
    }
}
// => Name: "John Lewis", Age: "20"
//    Name: "", Age: "21"
//    Name: "Wiktor", Age: ""

See this C# demo

The pattern (?:\bName=(?<Name>[^;]+).*?;\s+)?\bAge=(?<Age>\d+)|\bName=(?<Name>[^;]+)(?:.*?;\s+\bAge=(?<Age>\d+))? has 2 branches:

  • (?:\bName=(?<Name>[^;]+).*?;\s+)?\bAge=(?<Age>\d+) - the Name part is optional, Age is compulsory
  • | - or
  • \bName=(?<Name>[^;]+)(?:.*?;\s+\bAge=(?<Age>\d+))? - the Age part is optional, Name is compulsory