Jeff Meatball Yang - 10 months ago
Javascript Question

How to prevent Javascript injection attacks within user-generated HTML

I am saving user-submitted HTML (in a database). I must prevent Javascript injection attacks. The most pernicious I have seen is script hidden in a style="expression(...)" attribute.

In addition to this, a fair amount of valid user content will include special characters and XML constructs, so I'd like to avoid a whitelist approach (listing every allowable HTML element and attribute) if possible.

Examples of Javascript attack strings are:


"Hello, I have a
problem with the <dog>


"Hi, this <b
is black."

Is there a way to prevent such Javascript, and leave the rest intact?

The only solution I have so far is to use a regular expression to remove certain patterns. It solves case 1, but not case 2.

Sorry, forgot to mention environment - it's essentially the MS stack:

  • SQL Server 2005

  • C# 3.5 (ASP.NET)

  • Javascript (obviously) and jQuery.

I would like the chokepoint to be the ASP.NET layer - anyone can craft a bad HTTP request.

Edit 2:

Thanks for the links, everyone. Assuming that I can define my list (the content will include many mathematical and programming constructs, so a whitelist is going to be very annoying), I still have a question here:

What kind of parser will allow me to remove just the "bad" parts? The bad part could be an entire element, but what about scripts that reside in attributes? I can't remove <a href> attributes willy-nilly.

Answer Source

You think that's it? Check this out.

Whatever approach you take, you definitely need to use a whitelist. It's the only way to even come close to being safe about what you're allowing on your site.
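
To make the whitelist idea concrete, here is a minimal sketch. It is written in Python rather than C# purely for illustration, and it uses the standard library's html.parser. The tag list, the WhitelistSanitizer class and the sanitize function are all hypothetical names, not code from any real site. The idea is that whitelisted tags are re-emitted with every attribute stripped, and anything else is escaped so it shows up as plain text instead of being parsed as markup. That also means something like <dog> in a post survives as visible text.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical whitelist for illustration only; a real one would be larger.
ALLOWED_TAGS = {"b", "i", "em", "strong", "p", "br", "code", "pre"}
VOID_TAGS = {"br"}  # tags that never get a closing tag


class WhitelistSanitizer(HTMLParser):
    """Re-emits whitelisted tags without attributes; escapes everything else."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag in ALLOWED_TAGS:
            # Drop every attribute, so style="expression(...)" never survives.
            self.out.append("<" + tag + ">")
        else:
            # Unknown tags are kept as visible text, not as markup.
            self.out.append(escape(self.get_starttag_text()))

    def handle_startendtag(self, tag, attrs):
        # Treat <br/> like <br>; no separate closing tag is emitted.
        self.handle_starttag(tag, attrs)

    def handle_endtag(self, tag):
        if tag in ALLOWED_TAGS:
            if tag not in VOID_TAGS:
                self.out.append("</" + tag + ">")
        else:
            self.out.append(escape("</" + tag + ">"))

    def handle_data(self, data):
        # Text (including the body of a <script> element) is always escaped.
        self.out.append(escape(data))

    # Comments, doctypes and processing instructions are not handled,
    # so the base class silently drops them.


def sanitize(html_text):
    parser = WhitelistSanitizer()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)
```

This sketch does not balance unclosed tags, and it throws away all attributes, which is stricter than most sites want. It is only meant to show the shape of "parse, keep what is listed, escape the rest".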


I'm not familiar with .NET, unfortunately, but you can check out Stack Overflow's own battle with XSS and the code that was written to parse HTML posted on this site: link. Obviously you might need to change this because your whitelist is bigger, but that should get you started.
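
On the question of scripts living inside attributes: you don't have to drop the whole <a> element. You can whitelist attributes per tag and then check the URL scheme of href. Below is a rough Python sketch of that check. The names ALLOWED_ATTRS, SAFE_SCHEMES, is_safe_url and filter_attrs are made up for this example. It assumes the attribute values have already been entity-decoded by whatever HTML parser you use.

```python
import re

# Hypothetical per-tag attribute whitelist and scheme list, for illustration.
ALLOWED_ATTRS = {"a": {"href", "title"}}
SAFE_SCHEMES = {"http", "https", "mailto"}


def is_safe_url(url):
    """Accept relative URLs and URLs whose scheme is explicitly whitelisted."""
    # Browsers tolerate whitespace and control characters inside a scheme
    # (e.g. "java\tscript:"), so strip them before looking at it.
    cleaned = "".join(ch for ch in url if ch > " ").lower()
    # A ':' only marks a scheme if it comes before any '/', '?' or '#'.
    head = re.split(r"[/?#]", cleaned, maxsplit=1)[0]
    if ":" not in head:
        return True  # relative URL, no scheme at all
    return head.split(":", 1)[0] in SAFE_SCHEMES


def filter_attrs(tag, attrs):
    """Keep only whitelisted (name, value) pairs; drop unsafe hrefs."""
    allowed = ALLOWED_ATTRS.get(tag, set())
    kept = []
    for name, value in attrs:
        if name not in allowed:
            continue  # drops style, onclick, and anything else unlisted
        if name == "href" and not is_safe_url(value or ""):
            continue
        kept.append((name, value))
    return kept
```

With this, a link to http://example.com/ keeps its href, while an href starting with javascript: is removed and the link text stays in place.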