Baylock - 6 months ago
jQuery Question

Regex: allow everything but some selected characters

I would like to validate a textarea, and I just don't get regex (it took me a whole day and a bunch of tutorials to get this far).

Basically, I would like to allow everything (line breaks and carriage returns included) except the characters that could be malicious (those that could lead to a security breach).
As there are very few characters that are not allowed, I assume it would make more sense to create a blacklist than a whitelist.

My question is: what is the standard "everything but" in Regex?

I'm using JavaScript and jQuery.

I tried this but it doesn't work (it's awful, I know..):

var messageReg = /^[a-zA-Z0-9éèêëùüàâöïç\"\/\%\(\).'?!,@$#§-_ \n\r]+$/;


Thank you.

Answer

As Esailija mentioned, this won't do anything for real security.

The code you posted is almost a negated set. As murgatroid99 mentioned, the ^ goes inside the brackets; the regular expression will then match anything that is not in that list. But it looks like you really want to strip out those characters, so your regexp doesn't need to be negated.
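To make the two roles of ^ concrete, here is a minimal sketch. The blacklist used (<, > and &) is a hypothetical one chosen only for illustration; it is not a list recommended anywhere in this thread.

```javascript
// Hypothetical blacklist for illustration: <, > and &.
// The ^ outside the brackets anchors the match to the start of the string;
// the ^ as the first character inside the brackets negates the set.
var onlyAllowed = /^[^<>&]+$/;

console.log(onlyAllowed.test("Hello,\nworld!")); // true
console.log(onlyAllowed.test("<script>"));       // false
```

A negated set also matches line breaks, so \n and \r do not need to be listed separately in this form.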

Your code should look like:

str.replace(/[a-zA-Z0-9éèêëùüàâöïç\"\/\%\(\).'?!,@$#\-_ \n\r]/g, "");

That says: remove every character that matches my regular expression.

However, that also says you don't want to keep a-zA-Z0-9. Are you sure you want to strip those out?
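If the intent is actually the opposite, that is, to keep those characters and remove everything else, one option is to negate the set inside the replace call. This is only a sketch: keepAllowed is a hypothetical helper name, and the hyphen is escaped so that it is read as a literal character rather than as a range.

```javascript
// Hypothetical helper: removes every character that is NOT in the allowed set.
// The leading ^ negates the set; \- is a literal hyphen, not a range.
function keepAllowed(str) {
  return str.replace(/[^a-zA-Z0-9éèêëùüàâöïç"\/%().'?!,@$#\-_ \n\r]/g, "");
}

console.log(keepAllowed("Hi <b>there</b>!")); // "Hi bthere/b!"
```

Note that replace returns a new string and leaves the original untouched, so the result has to be assigned or returned.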

Also, Chrome doesn't like § in regular expressions; you have to use \x followed by the hex code for the character (\xA7 for §).
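A sketch of that escape follows. § is U+00A7, so it can be written as \xA7. One related point that may explain the failure in the original pattern: an unescaped hyphen between two characters inside a set defines a range, so §-_ is read as a range from U+00A7 down to U+005F, which is out of order and is rejected. Escaping the hyphen avoids this. The variable names are illustrative only.

```javascript
// \xA7 is the hex escape for § (U+00A7); \- is a literal hyphen.
var sectionOrHyphen = /[\xA7\-_]/;

console.log(sectionOrHyphen.test("§")); // true
console.log(sectionOrHyphen.test("-")); // true
console.log(sectionOrHyphen.test("a")); // false

// With an unescaped hyphen, §-_ is an out-of-order range,
// so building the expression throws a SyntaxError.
var rangeError = null;
try {
  new RegExp("[\xA7-_]");
} catch (e) {
  rangeError = e;
}
console.log(rangeError instanceof SyntaxError); // true
```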