I'm working on a mini-CMS module for one of my projects, where users can edit content in Markdown. The Markdown is parsed on the client side, which is also how I show a preview.
I spent a while thinking about how to send the input to the server and how to store it in the database. I decided to avoid duplicating the Markdown parsing on the server side, and instead send both the Markdown and the parsed HTML to the server. I think the added overhead is minimal nowadays, even on an edit-heavy site.
So at the final stage I still need to validate the HTML sent to the server, since accepting client-generated HTML is the obvious weak point in the system's security. I've read a lot about Microsoft's AntiXSS implementation and how it is (or was) quite unusable for such scenarios because it was too greedy. For example, I found an article that even includes helper code (using HtmlAgilityPack) for a usable sanitizing implementation.
Unfortunately I haven't found anything newer than 2013 on this topic. What is the proper way today to sanitize HTML so that a set of allowed tags and attributes is kept, while the result is still safe from XSS attacks? Is code like the one in the article still needed, or are there built-in solutions?
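To make the requirement concrete, here is a minimal sketch of the allowlist behaviour I have in mind. It is written in Python with only the standard library, purely for illustration; the tag list, attribute list, URL schemes, and the names `AllowlistSanitizer`, `is_safe_url` and `sanitize` are all my own assumptions, not taken from the article or from any existing library. It is not a vetted sanitizer, just a way of showing what "allowed tags and attributes" should mean.

```python
from html import escape
from html.parser import HTMLParser
from urllib.parse import urlparse

# Hypothetical allowlist: tag name -> attributes permitted on that tag.
ALLOWED_TAGS = {
    "p": set(), "em": set(), "strong": set(), "code": set(), "pre": set(),
    "ul": set(), "ol": set(), "li": set(), "blockquote": set(),
    "a": {"href", "title"},
}
URL_ATTRS = {"href"}                       # attributes that carry a URL
ALLOWED_SCHEMES = {"http", "https", "mailto"}
DROP_CONTENT = {"script", "style"}         # drop these tags AND their content


def is_safe_url(value):
    """Accept only absolute URLs whose scheme is explicitly allowed."""
    # Strip whitespace/control characters that could hide a scheme.
    cleaned = "".join(ch for ch in value if ch > " ")
    try:
        return urlparse(cleaned).scheme.lower() in ALLOWED_SCHEMES
    except ValueError:
        return False


class AllowlistSanitizer(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in DROP_CONTENT:
            self.skip_depth += 1
            return
        if self.skip_depth or tag not in ALLOWED_TAGS:
            return  # unknown tag: drop the tag itself, keep its text
        kept = []
        for name, value in attrs:
            if name not in ALLOWED_TAGS[tag] or value is None:
                continue  # e.g. onclick, style, valueless attributes
            if name in URL_ATTRS and not is_safe_url(value):
                continue  # e.g. javascript: links
            kept.append(f' {name}="{escape(value, quote=True)}"')
        self.out.append(f"<{tag}{''.join(kept)}>")

    def handle_endtag(self, tag):
        if tag in DROP_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
            return
        if not self.skip_depth and tag in ALLOWED_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            # Text is always re-escaped, never passed through raw.
            self.out.append(escape(data, quote=False))


def sanitize(html_text):
    parser = AllowlistSanitizer()
    parser.feed(html_text)
    parser.close()  # flush any buffered text
    return "".join(parser.out)


print(sanitize('<p onclick="evil()">Hi <script>alert(1)</script>'
               '<a href="javascript:alert(1)">x</a></p>'))
```

The point of the sketch is that everything not explicitly allowed is dropped and the output is rebuilt from scratch rather than patched, which is what I would expect from a usable sanitizer. It does not balance unclosed tags or handle every edge case, which is exactly why I would prefer a maintained solution over hand-rolled code.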
Also, if client-side Markdown parsing is not a viable choice, what are the other options? What I want to avoid is duplicating the Markdown logic on both the client and the server. For example, I've prepared several custom extensions for