Steve Steve - 3 months ago 32
ASP.NET (C#) Question

Microsoft ASP.NET C# Regex: escaping multiple reserved characters

I just want a regex that returns true/false depending on whether a string contains any character that is not a regularly type-able character. This should be an easy thing to do, shouldn't it?

I don't have a pattern, per se; I just want to know if any character exists that isn't in the list.

In the regular regex world I would simply write:

[^0-9a-zA-Z~`!@#$%\^ &*()_-+={\[}]|\\:;\"'<,>.?/] // <space> before the ampersand

...I know, it's a little bloated, but it makes the point for this post...

I find you can't escape multiple reserved characters.
For instance, a Regex built from Regex.Escape("[") + Regex.Escape("^") will not hit on:
"st[eve" or "st^eve"

For example, the following fails:

string ss = Regex.Escape("[") + Regex.Escape("^");
Regex rx = new Regex(ss);
string s = "st^eve"; rx.IsMatch(s); // false

as will any of these:

string ss = Regex.Escape("[") + "[0-9]";
Regex rx = new Regex(ss);
string s1 = "st^eve"; rx.IsMatch(s1); // false
string s2 = "st^ev0e"; rx.IsMatch(s2); // false
string s3 = "stev0e"; rx.IsMatch(s3); // false

The only use of Regex.Escape that I can get to match is a single escaped character:

string ss = Regex.Escape("^");
Regex rx = new Regex(ss);
string s = "st^eve"; rx.IsMatch(s); // true

Do I have to develop a separate test for EACH character that needs escaping, IN ADDITION TO a test for the non-escaped characters?

Is this what other people are doing?

I'm open to ideas if there is a better way.

Thanks for your consideration.


Think about what you're generating as an expression. Your example pattern

string ss = Regex.Escape("[") + Regex.Escape("^");

is equivalent to:

string ss = @"\[\^";

That is, it's not looking for [ or ^, it's looking for [ followed by ^. So ste[^ve would match.
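
To see the difference in practice, here is a minimal sketch that builds the same concatenated pattern and tests it against the strings from the question. The class and method names (EscapeSequenceDemo, BuildPattern) are made up for illustration; only Regex.Escape and Regex.IsMatch come from the framework.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EscapeSequenceDemo
{
    // Hypothetical helper: concatenates two escaped literals, as in the question.
    // The result is the pattern \[\^ (a literal '[' followed by a literal '^').
    public static string BuildPattern() => Regex.Escape("[") + Regex.Escape("^");

    public static void Main()
    {
        string pattern = BuildPattern();
        Console.WriteLine(pattern);                           // \[\^

        // Neither string contains the two-character sequence "[^".
        Console.WriteLine(Regex.IsMatch("st[eve", pattern));  // False
        Console.WriteLine(Regex.IsMatch("st^eve", pattern));  // False

        // This one does contain "[^", so the pattern matches.
        Console.WriteLine(Regex.IsMatch("ste[^ve", pattern)); // True
    }
}
```

So the escaping itself works as expected; concatenating escaped pieces simply describes a sequence, not a choice between characters.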

If you want to match any string containing one or more of the characters, you need to add (non-escaped) brackets to create a set of characters, such as:

string ss = "[" + Regex.Escape("[") + Regex.Escape("^") + "]"; // pattern: [\[\^]

That is, you are asking the Regex engine to look for one character within the set of characters in the brackets.
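
A rough sketch of the bracketed set, plus one possible way to do the original "any non-type-able character" check in a single test. Two things here are assumptions and not part of the answer above: the names CharacterSetDemo, BracketOrCaret, NonTypeable and HasNonTypeable are invented for the example, and "type-able" is taken to mean printable ASCII (space through ~), which happens to cover the same characters as the list in the question.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CharacterSetDemo
{
    // [\[\^] : exactly one character that is either '[' or '^'.
    public static readonly Regex BracketOrCaret =
        new Regex("[" + Regex.Escape("[") + Regex.Escape("^") + "]");

    // Assumption: "type-able" = printable ASCII, 0x20 (space) through 0x7E ('~').
    // The leading ^ inside the brackets negates the set.
    public static readonly Regex NonTypeable = new Regex(@"[^\x20-\x7E]");

    // Hypothetical helper: true if any character falls outside the allowed range.
    public static bool HasNonTypeable(string input) => NonTypeable.IsMatch(input);

    public static void Main()
    {
        Console.WriteLine(BracketOrCaret.IsMatch("st[eve")); // True
        Console.WriteLine(BracketOrCaret.IsMatch("st^eve")); // True
        Console.WriteLine(BracketOrCaret.IsMatch("steve"));  // False

        Console.WriteLine(HasNonTypeable("st[^eve"));        // False
        Console.WriteLine(HasNonTypeable("st\u00E9ve"));     // True (contains 'é')
    }
}
```

Using a range inside the negated set avoids escaping each reserved character individually. If you do build a set from a list of literal characters, note that Regex.Escape does not escape ] or -, both of which have special meaning inside brackets, so those two need to be handled by hand.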