C# Question

Regular expressions. Match specific word between two words

I use C#. I have a string:

wordA wordB wordC wordB wordD


I need to match all occurrences of wordB between wordA and wordD.
I use lookahead and lookbehind to match everything between wordA and wordD like this:

(?<=wordA)(.*?)(?=wordD)


But something like

(?<=wordA)(wordB)(?=wordD)


matches nothing.
What would be the best way to match all occurrences of wordB between wordA and wordD?

Answer Source

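The second pattern matches nothing because its lookarounds require wordA immediately before wordB and wordD immediately after it. Put the .*? into the lookarounds instead: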

(?<=wordA.*?)wordB(?=.*?wordD)

See the regex demo

Now, the pattern means:

  • (?<=wordA.*?) - a positive lookbehind that requires wordA, followed by any 0+ chars (as few as possible), to appear immediately before the current position.
  • wordB - the literal text wordB.
  • (?=.*?wordD) - a positive lookahead that requires any 0+ chars (as few as possible) followed by wordD after the current position, so wordD can come right after wordB or after some other chars.
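
To see the breakdown above in action, here is a minimal sketch that lists each match and its position. It assumes the sample string from the question and a project that allows top-level statements (C# 9 or later); the variable names are illustrative only.

```csharp
using System;
using System.Text.RegularExpressions;

// Sample input taken from the question.
var input = "wordA wordB wordC wordB wordD";

// .NET allows unbounded quantifiers inside a lookbehind, so .*? is legal here.
var pattern = @"(?<=wordA.*?)wordB(?=.*?wordD)";

foreach (Match m in Regex.Matches(input, pattern))
{
    // Each match consumes only "wordB"; Index shows where it was found.
    Console.WriteLine($"{m.Value} at index {m.Index}");
}
// => wordB at index 6
// => wordB at index 18
```

Because the lookarounds consume no text, the two matches do not overlap and both occurrences are reported.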

If the input can span multiple lines, compile the regex with the RegexOptions.Singleline flag so that . can also match a newline (or prepend the inline modifier (?s) to the pattern: (?s)(?<=wordA.*?)wordB(?=.*?wordD)).
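
A rough comparison of the two ways to enable this behavior, next to the default. The two-line input is a made-up example, not taken from the question, and the snippet assumes top-level statements are available.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical two-line input: a newline separates the two wordB occurrences.
var input = "wordA wordB\nwordB wordD";
var pattern = @"(?<=wordA.*?)wordB(?=.*?wordD)";

// Default options: . stops at \n, so neither lookaround can span the line break.
Console.WriteLine(Regex.Matches(input, pattern).Count);                          // => 0

// RegexOptions.Singleline and the inline (?s) modifier behave the same here.
Console.WriteLine(Regex.Matches(input, pattern, RegexOptions.Singleline).Count); // => 2
Console.WriteLine(Regex.Matches(input, "(?s)" + pattern).Count);                 // => 2
```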

If the "words" consist of letters/digits/underscores and you need to match them as whole words, wrap wordA, wordB and wordD with \b (word boundaries).
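
As an illustration of the difference word boundaries make, the sketch below counts matches with and without \b. The input containing "wordBs" is an assumed example, and the names loose and strict are only labels for this snippet.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical input: "wordBs" merely starts with the target word.
var input = "wordA wordBs wordB wordD";

var loose = @"(?<=wordA.*?)wordB(?=.*?wordD)";
var strict = @"(?<=\bwordA\b.*?)\bwordB\b(?=.*?\bwordD\b)";

// Without \b, the "wordB" inside "wordBs" is matched too.
Console.WriteLine(Regex.Matches(input, loose).Count);   // => 2

// With \b, only the standalone wordB is matched.
Console.WriteLine(Regex.Matches(input, strict).Count);  // => 1
```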

Always test your regexes in the target environment:

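using System;
using System.Text.RegularExpressions;

// The input contains a newline, so Singleline is needed for . to cross it.
var s = "wordA wordB wordC wordB \n wordD";
var pattern = @"(?<=wordA.*?)wordB(?=.*?wordD)";
// $& in the replacement stands for the whole match, so each wordB gets wrapped.
var result = Regex.Replace(s, pattern, "<<<$&>>>", RegexOptions.Singleline);
Console.WriteLine(result);
// => wordA <<<wordB>>> wordC <<<wordB>>> 
//     wordD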

See C# demo.