Max Max - 3 months ago 28
PHP Question

extract stylesheets via regex

Yes, I know, parsing HTML with regular expressions is a bad idea. But I am working with legacy code that is supposed to extract all <link> and <style> elements from an HTML page. I would change it and use the DOM extension instead, but after the regex there is a huge code block that relies on the way preg_match_all() returns the matched results.

The script is using this regex:

$pattern = '/<(link|style)(?=.+?(?:type="(text\/css)"|>))(?=.+?(?:media="(.*?)"|>))(?=.+?(?:href="(.*?)"|>))(?=.+?(?:rel="(.*?)"|>))[^>]+?\2[^>]+?(?:\/>|<\/style>)\s*/is';

preg_match_all($pattern, $htmlContent, $cssTags);

But it doesn't work: no elements are matched. Unfortunately I really suck at regex, so if someone could help me out, that would be great.
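One way to narrow this down is to run the pattern against a few small fragments. The sketch below does that with three made-up samples (they are assumptions, not taken from the real page). It suggests two likely causes, though neither is confirmed for the actual input: the `\2` backreference needs a literal type="text/css" attribute to have been captured, and `[^>]+?` cannot cross the `>` that closes an opening tag, so only tags ending in `/>` can match.

```php
<?php
// The pattern from the question, unchanged.
$pattern = '/<(link|style)(?=.+?(?:type="(text\/css)"|>))(?=.+?(?:media="(.*?)"|>))(?=.+?(?:href="(.*?)"|>))(?=.+?(?:rel="(.*?)"|>))[^>]+?\2[^>]+?(?:\/>|<\/style>)\s*/is';

// Hypothetical sample fragments, chosen only to illustrate the behaviour.
$samples = [
    // XHTML-style link with an explicit type and a self-closing slash.
    'xhtml_link'  => '<link rel="stylesheet" type="text/css" href="a.css" />',
    // HTML5-style link: no type attribute, no self-closing slash.
    'html5_link'  => '<link rel="stylesheet" href="a.css">',
    // Inline style block: the opening tag ends with a plain ">".
    'style_block' => '<style type="text/css">body{}</style>',
];

$counts = [];
foreach ($samples as $name => $html) {
    // preg_match_all() returns the number of full matches found.
    $counts[$name] = preg_match_all($pattern, $html, $cssTags);
}

var_dump($counts);
```

If the real page uses HTML5-style markup like the second sample, or mostly inline <style> blocks, that alone would explain why nothing is matched.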

Max Max

Thanks to all for your answers, but I finally rewrote that bit using the DOM extension. That should make it way more robust.
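For anyone in the same situation, here is a minimal sketch of what such a DOM-based rewrite could look like. It is not the actual code from the project: the function name extractStylesheets() and the shape of the returned array are assumptions made for illustration, and the downstream code would still need to be adapted to it.

```php
<?php
// Hypothetical helper: collects stylesheet <link> elements and <style> blocks.
function extractStylesheets(string $html): array
{
    $doc = new DOMDocument();

    // Collect libxml warnings internally instead of emitting them;
    // legacy markup is often not well-formed.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $results = [];

    // The XPath union returns matching nodes in document order.
    foreach ($xpath->query('//link | //style') as $node) {
        // Skip <link> elements that are not stylesheets (icons, feeds, ...).
        if ($node->nodeName === 'link'
            && stripos($node->getAttribute('rel'), 'stylesheet') === false) {
            continue;
        }

        // getAttribute() returns an empty string for missing attributes.
        $results[] = [
            'tag'   => $node->nodeName,
            'type'  => $node->getAttribute('type'),
            'media' => $node->getAttribute('media'),
            'href'  => $node->getAttribute('href'),
            'rel'   => $node->getAttribute('rel'),
            'css'   => $node->nodeName === 'style' ? $node->textContent : '',
        ];
    }

    return $results;
}

// Usage with a made-up page:
$page = '<html><head>'
      . '<link rel="stylesheet" type="text/css" href="a.css" media="screen">'
      . '<link rel="icon" href="f.ico">'
      . '<style type="text/css">body{color:red}</style>'
      . '</head><body></body></html>';

$sheets = extractStylesheets($page);
```

Unlike the regex, this does not depend on attribute order, on a type attribute being present, or on self-closing slashes.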