good-to-know - 1 month ago
C# Question

Split String by Regex Expression

This is my string:

19282511~2017-08-28 13:24:28~Entering (A/B)~1013~283264/89282511~2017-08-28 13:24:28~Entering (A/B)~1013~283266/79282511~2017-08-28 13:24:28~Entering (A/B)~1013~283261


I would like this string to be split as shown below:

19282511~2017-08-28 13:24:28~Entering (A/B)~1013~283264
89282511~2017-08-28 13:24:28~Entering (A/B)~1013~283266
79282511~2017-08-28 13:24:28~Entering (A/B)~1013~283261


I cannot blindly split my string on every slash (/), because the value A/B would also get split.

Any idea how to do this with a regular expression?

Your help will definitely be appreciated.

Answer Source

You may split on a / that sits between two digits:

(?<=\d)/(?=\d)

See the regex demo

Details

  • (?<=\d) - a positive lookbehind that requires a digit to appear immediately to the left of the current location
  • / - a / char
  • (?=\d) - a positive lookahead that requires a digit to appear immediately to the right of the current location.

Since the \d pattern is inside non-consuming patterns, only / will be removed upon splitting and the digits will remain in the resulting items.
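As a minimal sketch of the split described above, the lookaround pattern can be passed to Regex.Split. The variable names (s, records) are my own, and the input is the sample string from the question.

```csharp
using System;
using System.Text.RegularExpressions;

// Sample input from the question: three records joined by '/'.
string s = "19282511~2017-08-28 13:24:28~Entering (A/B)~1013~283264/"
         + "89282511~2017-08-28 13:24:28~Entering (A/B)~1013~283266/"
         + "79282511~2017-08-28 13:24:28~Entering (A/B)~1013~283261";

// Split only on a '/' that has a digit on both sides.
// The '/' inside "A/B" is surrounded by letters, so it is left alone.
string[] records = Regex.Split(s, @"(?<=\d)/(?=\d)");

foreach (string record in records)
{
    Console.WriteLine(record);
}
```

Because the lookarounds consume no characters, each record keeps its trailing and leading digits.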


Another idea is to match and capture these strings using

/?([^~]*(?:~[^~]*){3}~\d+)

See this regex demo.

Details

  • /? - 1 or 0 / chars
  • ([^~]*(?:~[^~]*){3}~\d+) - Group 1 (what you need to grab):

    • [^~]* - zero or more chars other than ~
    • (?:~[^~]*){3} - exactly 3 sequences of ~ and then 0+ chars other than ~
    • ~\d+ - a ~ and then 1 or more digits.

The C# code will look like this (it needs using System.Linq; and using System.Text.RegularExpressions;):

    var results = Regex.Matches(s, @"/?([^~]*(?:~[^~]*){3}~\d+)")
        .Cast<Match>()
        .Select(m => m.Groups[1].Value)
        .ToList();
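As a quick sanity check, the sketch below runs both approaches on the sample string from the question and compares the results. The variable names (bySplit, byMatch, sameRecords) are my own. It assumes every record has exactly five ~-separated fields ending in digits, as in the sample.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

string s = "19282511~2017-08-28 13:24:28~Entering (A/B)~1013~283264/"
         + "89282511~2017-08-28 13:24:28~Entering (A/B)~1013~283266/"
         + "79282511~2017-08-28 13:24:28~Entering (A/B)~1013~283261";

// Approach 1: split on a '/' that sits between two digits.
string[] bySplit = Regex.Split(s, @"(?<=\d)/(?=\d)");

// Approach 2: match each record and take capture group 1,
// which excludes the optional leading '/'.
string[] byMatch = Regex.Matches(s, @"/?([^~]*(?:~[^~]*){3}~\d+)")
    .Cast<Match>()
    .Select(m => m.Groups[1].Value)
    .ToArray();

// Both approaches should produce the same three records, in the same order.
bool sameRecords = bySplit.SequenceEqual(byMatch);
Console.WriteLine(sameRecords);
```

The matching approach is stricter about record shape, so it may be preferable when the input could contain malformed records.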


NOTE: By default, \d matches all Unicode digits. If you do not want this behavior, use the RegexOptions.ECMAScript option, or replace \d with [0-9] to only match ASCII digits.
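To illustrate the note, here is a small sketch comparing the three options on a non-ASCII digit. The test character (ARABIC-INDIC DIGIT THREE, U+0663) and the variable names are my own choices for the demonstration.

```csharp
using System;
using System.Text.RegularExpressions;

// U+0663 is a Unicode decimal digit, but not an ASCII digit.
string arabicIndicThree = "\u0663";

// Default behavior: \d matches any Unicode decimal digit.
bool defaultMatch = Regex.IsMatch(arabicIndicThree, @"\d");

// With RegexOptions.ECMAScript, \d is restricted to [0-9].
bool ecmaMatch = Regex.IsMatch(arabicIndicThree, @"\d", RegexOptions.ECMAScript);

// An explicit ASCII character class behaves the same regardless of options.
bool asciiClassMatch = Regex.IsMatch(arabicIndicThree, "[0-9]");

Console.WriteLine(defaultMatch);    // True
Console.WriteLine(ecmaMatch);       // False
Console.WriteLine(asciiClassMatch); // False
```

If the input is known to contain only ASCII digits, as in the sample string, either form gives the same result. Replacing \d with [0-9] makes that intent explicit in the pattern itself.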