Adam Adam - 1 month ago 7
HTML Question

How would I parse this HTML in PHP?

I've exported my Firefox bookmarks as HTML so I can download my extensive music collection onto my phone. My problem is that there is no easy way to do this that I know of.

My intention is to use PHP to parse the HTML into an array of the URLs.

Here's what the HTML looks like:

<DT><A HREF="https://www.youtube.com/watch?v=Ue8PpA557Bc" ADD_DATE="1477165404" LAST_MODIFIED="1477165404" ICON_URI="https://s.ytimg.com/yts/img/favicon_144-vflWmzoXw.png" ICON="data:image/png;base64,">Don Diablo - Knight Time (Official Music Video) - YouTube</A>


How would I do this?

Answer

If $html holds the HTML as a string, you can parse it with DOMDocument and select the href attributes with XPath.

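<?php

$html = '<DT><A HREF="https://www.youtube.com/watch?v=Ue8PpA557Bc" ADD_DATE="1477165404" LAST_MODIFIED="1477165404" ICON_URI="https://s.ytimg.com/yts/img/favicon_144-vflWmzoXw.png" ICON="data:image/png;base64,">Don Diablo - Knight Time (Official Music Video) - YouTube</A>';

// A full bookmarks export is not strictly valid HTML, so collect
// libxml parser warnings internally instead of printing them.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// loadHTML() lowercases tag and attribute names, so this matches <A HREF="...">.
$nodeList = $xpath->query("//a/@href");

$links_array = [];

foreach ($nodeList as $node) {
    $links_array[] = $node->nodeValue;
}

echo "<pre>";
print_r($links_array);
echo "</pre>";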

The output here is:

Array
(
    [0] => https://www.youtube.com/watch?v=Ue8PpA557Bc
)
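
To run the same approach over a whole export rather than a single line, something like the following might work. It is only a minimal sketch: the function name extractBookmarkUrls, the http/https filter, and the de-duplication step are assumptions added for illustration and are not part of the answer above, and the second and third entries in the sample are made up to show the filtering.

```php
<?php

/**
 * Collect the href of every <a> element in an HTML string.
 * Hypothetical helper, added for illustration only.
 */
function extractBookmarkUrls(string $html): array
{
    // Bookmark exports are not strictly valid HTML; collect parser
    // warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    $doc->loadHTML($html);

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);

    $urls = [];
    foreach ($xpath->query('//a/@href') as $attr) {
        // Assumption: only http(s) links are wanted, so entries such as
        // javascript: bookmarklets are skipped.
        if (preg_match('#^https?://#i', $attr->nodeValue)) {
            $urls[] = $attr->nodeValue;
        }
    }

    // Drop duplicates and reindex from 0.
    return array_values(array_unique($urls));
}

// Small inline sample standing in for the exported file.
$sample = <<<HTML
<DL><p>
<DT><A HREF="https://www.youtube.com/watch?v=Ue8PpA557Bc" ADD_DATE="1477165404">Don Diablo - Knight Time</A>
<DT><A HREF="javascript:void(0)">Made-up bookmarklet</A>
<DT><A HREF="https://www.youtube.com/watch?v=Ue8PpA557Bc">Made-up duplicate</A>
</DL>
HTML;

$urls = extractBookmarkUrls($sample);
print_r($urls);
```

For a real export you would pass the file contents instead of the sample, for example the result of file_get_contents() on wherever you saved the bookmarks (the path is up to you). With the sample above, only the single YouTube URL should remain after filtering and de-duplication.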