Tim Tim - 26 days ago 14
HTML Question

Parsing link from table with PHP

I'm using the following code to parse table elements from the Australian Securities Exchange:

<?php

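$dom = new DOMDocument();

//discard white space (has to be set before the document is loaded)
$dom->preserveWhiteSpace = false;

//load the html (loadHTMLFile() returns a boolean, the document itself stays in $dom)
$dom->loadHTMLFile("http://www.asx.com.au/asx/statistics/prevBusDayAnns.do");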

//the table by its tag name
$tables = $dom->getElementsByTagName('table');

//get all rows from the table
$rows = $tables->item(0)->getElementsByTagName('tr');
// get each column by tag name
$cols = $rows->item(0)->getElementsByTagName('th');
$row_headers = array();
foreach ($cols as $node) {
//print $node->nodeValue."\n";
$row_headers[] = $node->nodeValue;
}

$table = array();
//get all rows from the table
$rows = $tables->item(0)->getElementsByTagName('tr');

foreach ($rows as $row)
{
// get each column by tag name
$cols = $row->getElementsByTagName('td');

// skip the header row, which has th cells and no td cells
if ($cols->length < 4) {
continue;
}

$companysymbol = $cols->item(0)->nodeValue;
$pubtime = $cols->item(1)->nodeValue;
$newstitle = $cols->item(3)->nodeValue;

echo $companysymbol . '<br>';
echo $pubtime . '<br>';
echo $newstitle . '<br><br>';

}

?>


The code is working fine, but in addition to echoing $companysymbol, $pubtime and $newstitle, I would also like to echo the link (the PDF link) that appears in each table row. Can someone tell me how?

Thanks in advance for your help!!

mx0 mx0
Answer

You need to get the href attribute of the <a> element inside the cell, and then rebuild the link with the site's host name in front of it.

(...)
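// item() returns null when the cell or the <a> element is missing, so check before dereferencing
$cell = $cols->item(5);
$anchor = $cell ? $cell->getElementsByTagName('a')->item(0) : null;
$pdflink = $anchor ? $anchor->getAttribute('href') : '';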
(...)
echo "<a href='http://www.asx.com.au$pdflink'>pdf</a>".'<br>';

This will create a clickable link to the PDF.
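
For reference, here is a minimal, self-contained sketch of how the pieces might fit together. It parses an inline sample table instead of fetching the live page, so it can be run offline. The sample markup, the PDF path in it, and the function name extractAnnouncements are all made up for illustration; the column positions (0, 1, 3 and 5) are taken from the code above and are assumed to match the real page, which may be laid out differently.

```php
<?php
// Hypothetical sample markup standing in for the ASX page.
$sampleHtml = <<<HTML
<table>
<tr><th>Code</th><th>Time</th><th>Flag</th><th>Headline</th><th>Pages</th><th>PDF</th></tr>
<tr><td>ABC</td><td>10:01 AM</td><td></td><td>Quarterly Report</td><td>12</td><td><a href="/example/announcement.pdf">PDF</a></td></tr>
<tr><td>XYZ</td><td>10:05 AM</td><td></td><td>Trading Halt</td><td>1</td><td></td></tr>
</table>
HTML;

// Hypothetical helper: returns one array per data row of the first table.
function extractAnnouncements(string $html, string $baseUrl): array
{
    $dom = new DOMDocument();
    // Collect parser warnings internally instead of printing them.
    libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();

    $table = $dom->getElementsByTagName('table')->item(0);
    if ($table === null) {
        return [];
    }

    $result = [];
    foreach ($table->getElementsByTagName('tr') as $row) {
        $cols = $row->getElementsByTagName('td');
        if ($cols->length < 6) {
            continue; // header row (th cells only) or an incomplete row
        }
        // The link cell may be empty, in which case item(0) returns null.
        $anchor = $cols->item(5)->getElementsByTagName('a')->item(0);
        $result[] = [
            'symbol' => trim($cols->item(0)->nodeValue),
            'time'   => trim($cols->item(1)->nodeValue),
            'title'  => trim($cols->item(3)->nodeValue),
            'pdf'    => $anchor !== null ? $baseUrl . $anchor->getAttribute('href') : '',
        ];
    }
    return $result;
}

$rows = extractAnnouncements($sampleHtml, 'http://www.asx.com.au');

foreach ($rows as $r) {
    echo htmlspecialchars($r['symbol']) . '<br>';
    echo htmlspecialchars($r['time']) . '<br>';
    echo htmlspecialchars($r['title']) . '<br>';
    if ($r['pdf'] !== '') {
        echo "<a href='" . htmlspecialchars($r['pdf'], ENT_QUOTES) . "'>pdf</a><br>";
    }
    echo '<br>';
}
```

To use it against the live page, you could replace loadHTML($html) with loadHTMLFile() and the URL from the question. Escaping the values with htmlspecialchars() is optional here, but it keeps stray characters in the scraped text from breaking your own output.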