jason jason - 3 months ago 13
C# Question

Regular expression that works on dots

I have this regular expression:

string[] values = Regex
.Matches(mystring4, @"([\w-[\d]][\w\s-[\d]]+)|([0-9]+)")
.OfType<Match>()
.Select(match => match.Value.Trim())
.ToArray();


This regular expression turns this string:

MY LIMITED COMPANY (52100000 / 58447000)

into these strings:

MY LIMITED COMPANY - 52100000 - 58447000


This also works on non-English characters.

But there is one problem: when I have the string MY. LIMITED. COMPANY., it gets split at the dots too. I don't want that; the dots should stay part of the matched name. How can I do that? Thanks.

Answer

You can add a dot after each \w inside the character classes of your pattern. I also suggest removing the ( and ), which are unnecessary because only match.Value is used:

string[] values = Regex
      .Matches("MY. LIMITED. COMPANY. (52100000 / 58447000)", @"[\w.-[\d]][\w.\s-[\d]]+|[0-9]+")
      .OfType<Match>()
      .Select(match => match.Value.Trim())
      .ToArray(); 
foreach (var s in values)
    Console.WriteLine(s);

See the C# demo

Pattern:

  • [\w.-[\d]] - one word character other than a digit ([\w-[\d]], i.e. a Unicode letter or underscore) or a dot (.)
  • [\w.\s-[\d]]+ - one or more (due to the + quantifier at the end) characters, each of which is a non-digit word character, a dot (.), or whitespace (\s)
  • | - or
  • [0-9]+ - one or more ASCII-only digits
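
To make the difference concrete, here is a minimal sketch that runs both patterns against the string from the question. The class name PatternComparison and the helper Tokenize are made up for illustration. The first pattern is the one from the question with the capturing groups dropped, which should not change match.Value.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

static class PatternComparison
{
    // Question's pattern (groups dropped): a dot is in neither class, so it ends the match.
    public const string WithoutDots = @"[\w-[\d]][\w\s-[\d]]+|[0-9]+";

    // Answer's pattern: the dot is added to both character classes.
    public const string WithDots = @"[\w.-[\d]][\w.\s-[\d]]+|[0-9]+";

    // Tokenize is a hypothetical helper name, not a library method.
    public static string[] Tokenize(string pattern, string input) =>
        Regex.Matches(input, pattern)
             .OfType<Match>()
             .Select(m => m.Value.Trim())   // drop the whitespace picked up by \s
             .ToArray();

    static void Main()
    {
        const string input = "MY. LIMITED. COMPANY. (52100000 / 58447000)";

        // Expected: MY | LIMITED | COMPANY | 52100000 | 58447000
        Console.WriteLine(string.Join(" | ", Tokenize(WithoutDots, input)));

        // Expected: MY. LIMITED. COMPANY. | 52100000 | 58447000
        Console.WriteLine(string.Join(" | ", Tokenize(WithDots, input)));
    }
}
```

One thing to keep in mind: the first alternative needs at least two characters (one from the first class, then one or more from the second), so a one-letter name on its own would not be matched by either pattern.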