PeregrineStudios - 1 year ago
Javascript Question

RegEx for specific pattern, excluding URLs

Long story, but I need to take some fakey-HTML and replace it with real HTML using JavaScript. For example:

{span class:text-bold data:attribute}TITLE{/span}

Needs to change into:

<span class="text-bold" data="attribute">TITLE</span>

I'm using RegEx to do this because I can't possibly anticipate every attribute that could be placed on every element. The following expression more or less works for finding every instance of data:attribute:


However, there is an issue; this expression also matches URLs, for example:

In an attempt to exclude any URLs from matching, I changed the expression like so:


However, this did not have the expected effect: the pattern still matches URLs, just without the leading 'h'. For example,


I'll admit I haven't used RegEx in quite a while, so I'm probably misunderstanding something. How can I tell a RegEx pattern to NOT match any match that begins with a specific set of characters?

Answer Source

You need a negative lookahead right before the possible //, i.e. immediately after the colon.

"foo://bar".match(/(\w+:)(?!\/\/)([^\s\}]*)/); // no dice: returns null
"foo:bar".match(/(\w+:)(?!\/\/)([^\s\}]*)/);   // dice: matches, with groups "foo:" and "bar"

Of course, this will also block any attribute values that legitimately begin with //, but I assume that's a risk worth taking.
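
To show how the lookahead could be used for the actual replacement, here is a minimal sketch. The function name quoteAttributes is hypothetical, and the pattern is a slight adaptation of the one above: the colon is moved outside the first capture group so the replacement can drop it, and the g flag is added so every pair is rewritten. It only handles the attribute pairs, not the braces.

```javascript
// Hypothetical helper, not from the original question or answer.
// Rewrites name:value pairs as name="value", skipping anything
// where the colon is immediately followed by "//" (e.g. "http://").
function quoteAttributes(text) {
  // (\w+)      attribute name
  // :(?!\/\/)  a colon that is NOT followed by "//"
  // ([^\s}]*)  attribute value, up to whitespace or a closing brace
  return text.replace(/(\w+):(?!\/\/)([^\s}]*)/g, '$1="$2"');
}

console.log(quoteAttributes('{span class:text-bold data:attribute}TITLE{/span}'));
console.log(quoteAttributes('Visit http://example.com for details'));
console.log(quoteAttributes('{a href:http://example.com}link{/a}'));
```

The first call should quote both attributes, the second should leave the bare URL alone, and the third should quote the whole URL as the href value, since the value is consumed in one match before the scan reaches "http:". One caveat beyond the // case: a URL containing a port, such as http://example.com:8080, still has a second colon that this pattern will pick up.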
