Admiral Land - 1 month ago
C# Question

How to split string with Regex.Split and keep all separators?

I have a string, "substring1 delimiter1 substring2", where delimiter1 + substring2 is part of an address.

I also have two or more delimiters (delim1, delim2) which are equivalent in meaning.

I want to get a string array like this:

arr[0]="substring1";
arr[1]="delim1 substring2";


or,

arr[1]="delim2 substring2";


I have a pattern:

addrArr= Regex.Split(inputText, String.Concat("(?<=",delimeter1, "|",delimeter2, ")"), RegexOptions.None);


But it does not work well.

Can you help me create a valid pattern to do that?

Answer

You need a pattern with a lookahead only:

\s+(?=delim1|delim2)

The \s+ matches one or more whitespace characters (since your string contains whitespace). If there may be no whitespace before a delimiter, use \s* instead (but then you will need to remove empty entries from the result). If the delimiters must be whole words, use \b word boundaries: \s+(?=\b(?:delim1|delim2)\b).
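
As a rough sketch of the two variants just described (the sample inputs and variable names here are made up for illustration, not taken from the question): the first split uses \s* and filters out the empty entries it can produce, and the second uses \b so that a longer word such as "delim1x" is not treated as a delimiter.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Variant 1: \s* also matches the empty string, so Split can return
// empty entries; drop them afterwards. (Illustrative input.)
var tight = "substring1delim1 substring2 delim2 substr3";
var parts = Regex.Split(tight, @"\s*(?=delim1|delim2)")
                 .Where(s => s.Length > 0)
                 .ToArray();
Console.WriteLine(string.Join("\n", parts));
// substring1
// delim1 substring2
// delim2 substr3

// Variant 2: word boundaries, so "delim1x" does not count as a delimiter.
var words = "substring1 delim1x substring2 delim2 substring3";
var whole = Regex.Split(words, @"\s+(?=\b(?:delim1|delim2)\b)");
Console.WriteLine(string.Join("\n", whole));
// substring1 delim1x substring2
// delim2 substring3
```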

In C#:

addrArr = Regex.Split(inputText, string.Format(@"\s+(?={0})", string.Join("|", delimeters)));

If the delimiters can contain special regex metacharacters, you will need to apply Regex.Escape to each delimiter before joining them.
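
To illustrate why escaping matters, here is a small sketch with two hypothetical delimiters, "st." and "c++", chosen only because they contain the metacharacters . and +. After Regex.Escape they are matched literally, so "stx" in the sample input is not mistaken for "st.".

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical delimiters containing regex metacharacters ('.' and '+').
var rawDelimiters = new[] { "st.", "c++" };

// Escape each delimiter, then join them into one alternation.
var pattern = string.Format(@"\s+(?={0})",
    string.Join("|", rawDelimiters.Select(Regex.Escape)));
Console.WriteLine(pattern);
// \s+(?=st\.|c\+\+)

var result = Regex.Split("alpha st. beta c++ gamma stx delta", pattern);
Console.WriteLine(string.Join("\n", result));
// alpha
// st. beta
// c++ gamma stx delta
```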

A C# demo:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

var inputText = "substring1 delim1 substring2 delim2 substr3";
var delimeters = new List<string> { "delim1", "delim2" };
var addrArr = Regex.Split(inputText,
        string.Format(@"\s+(?={0})", string.Join("|", delimeters.Select(Regex.Escape))));
Console.WriteLine(string.Join("\n", addrArr));
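
For comparison, this sketch runs the lookbehind pattern from the question next to the lookahead pattern on the question's sample string (the variable names are illustrative). The lookbehind places the split point after the delimiter, so the delimiter stays attached to the first substring, which is probably why the original attempt did not give the desired array.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

var text = "substring1 delim1 substring2";

// Lookbehind (as in the question): the split point is AFTER the delimiter.
var behind = Regex.Split(text, @"(?<=delim1|delim2)");

// Lookahead (as in the answer): the split point is BEFORE the delimiter.
var ahead = Regex.Split(text, @"\s+(?=delim1|delim2)");

// Quote each element so leading spaces are visible.
Console.WriteLine(string.Join(", ", behind.Select(s => "\"" + s + "\"")));
// "substring1 delim1", " substring2"
Console.WriteLine(string.Join(", ", ahead.Select(s => "\"" + s + "\"")));
// "substring1", "delim1 substring2"
```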