cdn34 cdn34 - 4 months ago 7
HTML Question

Finding an HTML tag by its content using regex (JS)

I want to find the tag that contains the string "test string", even when that tag is nested inside other tags.

HTML example:

<section class="test-class1"><div><p class="test-class2">something else....test string</p></div></section>


Regex:

/.*<([a-zA-Z]*).*>.*?test string/g


Output:

p


I'm using https://regex101.com/#javascript for testing.

This regex works well when the HTML is small, but it times out as the HTML grows.

Is there a way to improve the performance of the regex?

Answer

< *(\w+)[^<>]*>[^<]*(?:<[^>]*)*test string

This matches p in the first capturing group ($1). It is not possible to speed the regex up much beyond this; you would be better off using plain JS functions.
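As a rough illustration of the "pure JS functions" suggestion, here is a minimal sketch that locates the string with indexOf and then walks backwards over the `<` characters to find the enclosing opening tag. The function name findEnclosingTag is hypothetical, and the approach is an assumption about what such a solution could look like, not something taken from the answer. It ignores comments, `<` characters inside attribute values, and void elements such as `<br>`, so in a browser a real DOM traversal would likely be more robust.

```javascript
// Hypothetical helper (not from the original answer): returns the name of the
// tag that encloses the first occurrence of `needle`, or null if none is found.
function findEnclosingTag(html, needle) {
  const textIndex = html.indexOf(needle);
  if (textIndex === -1) return null;

  // Sticky regex: only matches an opening tag starting exactly at lastIndex.
  const openTag = /<\s*([a-zA-Z][\w-]*)/y;

  // Closing tags seen while walking backwards; each one cancels the next
  // opening tag we meet, because that element closed before the needle.
  let unmatchedClosers = 0;

  let lt = html.lastIndexOf('<', textIndex);
  while (lt !== -1) {
    if (html[lt + 1] === '/') {
      unmatchedClosers++;
    } else {
      openTag.lastIndex = lt;
      const m = openTag.exec(html);
      if (m) {
        if (unmatchedClosers === 0) return m[1];
        unmatchedClosers--;
      }
    }
    if (lt === 0) break;
    lt = html.lastIndexOf('<', lt - 1);
  }
  return null;
}

const sample =
  '<section class="test-class1"><div><p class="test-class2">' +
  'something else....test string</p></div></section>';
console.log(findEnclosingTag(sample, 'test string'));
```

Each `<` before the match is visited at most once, so the work grows linearly with the distance back to the enclosing tag instead of depending on regex backtracking.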
