foxneSs - 1 month ago
C# Question

Regex - How to capture an arbitrary string appearing anywhere in a known string?

I need help making a regular expression. I have a string that is known at compile time; let's call it SpecificString. I also have another string whose value is not known; let's call it ArbitraryString for example purposes. The input string consists of exactly one SpecificString with ArbitraryString either inserted into it at any position or adjacent to it (directly before or directly after). I want a regex pattern that captures ArbitraryString from the input string for me to use later.




Examples:


  • example format: input string => captured group's value

  • SpecificArbitraryStringString => ArbitraryString // inside

  • SpecHAHAHALOLificString => HAHAHALOL

  • SpecificStringYOLO => YOLO // adjacent

  • SpecificStrisadng => sad

  • itsABea8tifulDaySpecificString => itsABea8tifulDay // also adjacent

  • Show to be a heartbreakerpecificString => how to be a heartbreaker

  • SpecificSt this is the last example ring => this is the last example
    (in the output of the last example, stackoverflow.com omitted the spaces at both ends for some reason; just ignore that and assume they are there)



I was only able to come up with a regex whose length grows linearly with the length of SpecificString, making it very difficult to maintain. Any ideas?

Pseudocode (not necessarily valid C#):

static string GetArbitraryString(string input)
{
    const string specificString = "SpecificString";
    var regex = new Regex(/* regex pattern to find */);
    var match = regex.Match(input);
    // Groups[0] is the whole match, so the captured value would be in Groups[1]
    string arbitraryString = match.Groups[1].Value;
    return arbitraryString;
}


Only regex answers will be accepted.

edit: the new question: Does an elegant regex solution to this even exist?

Answer

Well, here's the best I've got in terms of a regex answer (though it's still pretty damn inelegant in my opinion):

^(.*?)?S(.*?)?p(.*?)?e(.*?)?c(.*?)?i(.*?)?f(.*?)?i(.*?)?c(.*?)?S(.*?)?t(.*?)?r(.*?)?i(.*?)?n(.*?)?g(.*?)?$

Then, all you have to do is iterate over the capture groups and pick up the one that isn't empty. Simple as that.
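
To make the "iterate over the capture groups" step concrete, here is a minimal sketch that generates the pattern above from the constant instead of typing it by hand, then collects the non-empty groups. The class and method names (InterleavedPattern, Build, NonEmptyGroups) are made up for illustration and are not part of the original answer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;

static class InterleavedPattern
{
    // Builds ^(.*?)?S(.*?)?p ... g(.*?)?$ from the constant:
    // one optional lazy group before, between, and after the letters.
    public static Regex Build(string specificString)
    {
        var sb = new StringBuilder("^(.*?)?");
        foreach (char c in specificString)
        {
            // Escape each letter in case the constant contains regex metacharacters.
            sb.Append(Regex.Escape(c.ToString())).Append("(.*?)?");
        }
        sb.Append('$');
        return new Regex(sb.ToString());
    }

    // Returns the values of all non-empty capture groups, in pattern order.
    // The list is empty when the input does not match at all.
    public static List<string> NonEmptyGroups(Regex regex, string input)
    {
        Match match = regex.Match(input);
        if (!match.Success)
        {
            return new List<string>();
        }
        return match.Groups.Cast<Group>()
            .Skip(1)                      // Groups[0] is the whole match
            .Select(g => g.Value)
            .Where(v => v.Length > 0)
            .ToList();
    }
}
```

With the regex built from "SpecificString", the input SpecHAHAHALOLificString yields a single non-empty group, HAHAHALOL. One caveat worth checking for your data: when the arbitrary string itself contains letters that continue SpecificString, more than one group can end up non-empty. For SpecificArbitraryStringString the lazy groups split the value into Arbitrary and String, which is why the sketch returns a list rather than a single string.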

And, since you're in C#, you can even use named capture groups with the same name for all of them. Whichever one gets picked up will be the value of the named capture.
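
A sketch of the shared-name idea, with one change that should be stated plainly: this is a variation, not the pattern above. In .NET a repeated group name exposes its last capture, so the sketch uses one alternative per possible split point of SpecificString, which means only one copy of the group takes part in any successful match. The group name arb and the names SplitPointPattern, Build, and TryGetArbitraryString are hypothetical.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

static class SplitPointPattern
{
    // Hypothetical group name, shared by every alternative.
    const string GroupName = "arb";

    // Builds
    //   ^(?:(?<arb>.*)SpecificString|S(?<arb>.*)pecificString|...|SpecificString(?<arb>.*))$
    // i.e. one alternative for each position where the arbitrary string could sit.
    public static Regex Build(string specificString)
    {
        var alternatives = Enumerable.Range(0, specificString.Length + 1)
            .Select(i =>
                Regex.Escape(specificString.Substring(0, i))
                + "(?<" + GroupName + ">.*)"
                + Regex.Escape(specificString.Substring(i)));
        return new Regex("^(?:" + string.Join("|", alternatives) + ")$");
    }

    // Returns false when the input is not SpecificString with something inserted or adjacent.
    public static bool TryGetArbitraryString(string input, out string arbitrary)
    {
        Match match = Build("SpecificString").Match(input);
        arbitrary = match.Success ? match.Groups[GroupName].Value : string.Empty;
        return match.Success;
    }
}
```

The generated pattern is longer than the hand-written one, but it is derived from the constant, so changing SpecificString does not mean editing the regex. Because each alternative contains exactly one group, SpecificArbitraryStringString comes back as ArbitraryString in one piece.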

Demo on Regex101
