Mikusch - 6 months ago
HTML Question

Regex to match anything with href="" but between two other tags

I already have this regex pattern that checks for every href="" in my document:


Now I want it to match all href=""s ONLY in between <a></a> tags, with other attributes still allowed in between.

Do not match:

<base href="http://www.w3schools.com/images/" target="_blank">

<link rel="apple-touch-icon" sizes="57x57" href="/apple-icon-57x57.png">

Match:

<a href="http://www.w3schools.com/"></a>

<a class="re" href="http://www.w3schools.com/"></a>

<a href="http://www.w3schools.com/" class="re">This is a link</a>

Thanks in advance, I've not been able to solve this problem as of yet.


Note: HTML is not a regular language (matching nested tags needs a stack, which regular expressions don't have), so this can't be done 100% with a regex. But a close approximation is:
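
A minimal sketch of one such approximation, assuming Python's re module. The pattern and the names A_HREF and sample are illustrative choices of mine, and the sketch assumes double-quoted attribute values with no ">" inside them:

```python
import re

# Assumption: values are double-quoted and contain no '>' characters.
# "<a\s+" anchors the match to an <a ...> tag, and "(?:[^>]*?\s)?" lets
# other attributes sit between the tag name and href.
A_HREF = re.compile(r'<a\s+(?:[^>]*?\s)?href="([^"]*)"', re.IGNORECASE)

sample = '''
<base href="http://www.w3schools.com/images/" target="_blank">
<link rel="apple-touch-icon" sizes="57x57" href="/apple-icon-57x57.png">
<a href="http://www.w3schools.com/"></a>
<a class="re" href="http://www.w3schools.com/"></a>
<a href="http://www.w3schools.com/" class="re">This is a link</a>
'''

# With a single capture group, findall returns the captured URLs only.
links = A_HREF.findall(sample)
print(links)
```

On the sample from the question this should skip the base and link tags and return the URL of each of the three a tags.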


Or, if you use named subexpressions:


Which will also handle apostrophe-delimited attributes.
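
As a hedged sketch of the named-subexpression variant in Python's re module: the group names dq and sq and the helper extract_hrefs are hypothetical names picked for this illustration, with one group per quote style.

```python
import re

# One named group per quote style; exactly one of them participates in a match.
A_HREF_NAMED = re.compile(
    r"""<a\s+(?:[^>]*?\s)?href\s*=\s*(?:"(?P<dq>[^"]*)"|'(?P<sq>[^']*)')""",
    re.IGNORECASE,
)

def extract_hrefs(html):
    """Return the href values of <a> tags, whichever quote style was used."""
    urls = []
    for m in A_HREF_NAMED.finditer(html):
        dq = m.group("dq")
        urls.append(dq if dq is not None else m.group("sq"))
    return urls

print(extract_hrefs("""<a class='re' href='/x'>y</a> <a href="/z">w</a>"""))
```

Unquoted attribute values (href=/x) are deliberately not handled here.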

The last example can also be rewritten as:


but remember to use the second subexpression, not the first one.
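
One way to read this, sketched in Python under the assumption that the rewrite captures the opening quote and reuses it through a backreference: subexpression 1 then holds the quote character and subexpression 2 holds the URL. The name A_HREF_BACKREF is illustrative.

```python
import re

# Group 1 captures the opening quote (" or '); \1 requires the same
# character to close the value. Group 2 is the URL itself.
A_HREF_BACKREF = re.compile(
    r"""<a\s+(?:[^>]*?\s)?href\s*=\s*(["'])(.*?)\1""",
    re.IGNORECASE,
)

m = A_HREF_BACKREF.search("""<a class='re' href='http://www.w3schools.com/'>x</a>""")
quote, url = m.group(1), m.group(2)
print(url)
```

Note that findall on this pattern returns (quote, url) tuples, so the URL is the second element of each tuple.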

It wasn't clear whether you also want to grab the name of the link (its text), but if you do, whichever regex you choose, you can append a simple suffix to capture it. For example, for the named subexpressions:
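
A sketch of such an addition in Python, assuming the link text is everything up to the next closing a tag. The group name "name" and the helper extract_links are hypothetical, and links nested inside other links are not handled:

```python
import re

# Same href pattern with named groups, followed by:
#   [^>]*>            the rest of the opening tag
#   (?P<name>.*?)     the link text, matched lazily
#   </a\s*>           the closing tag
A_LINK = re.compile(
    r"""<a\s+(?:[^>]*?\s)?href\s*=\s*(?:"(?P<dq>[^"]*)"|'(?P<sq>[^']*)')"""
    r"""[^>]*>(?P<name>.*?)</a\s*>""",
    re.IGNORECASE | re.DOTALL,  # DOTALL lets the link text span lines
)

def extract_links(html):
    """Return (url, text) pairs for each <a href=...>...</a> found."""
    pairs = []
    for m in A_LINK.finditer(html):
        dq = m.group("dq")
        pairs.append((dq if dq is not None else m.group("sq"), m.group("name")))
    return pairs

sample = '''
<link rel="apple-touch-icon" sizes="57x57" href="/apple-icon-57x57.png">
<a href="http://www.w3schools.com/"></a>
<a href="http://www.w3schools.com/" class="re">This is a link</a>
'''
print(extract_links(sample))
```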