TheEditor - 1 year ago
HTML Question

Simple HTML DOM getting all attributes from a tag

Sort of a two-part question, but maybe one answers the other. I'm trying to get a piece of information out of an <a> tag in markup like this:

<div id="foo">
  <div class="bar"><a data1="xxxx" data2="xxxx" href="">Inner text</a></div>
  <div class="bar2"><a data3="xxxx" data4="xxxx" href="">more text</a></div>
</div>

Here is what I'm using now.

$articles = array();
foreach ($html->find('div[class=bar] a') as $a) {
  $articles[] = array($a->href, $a->innertext);
}

This works perfectly to grab the href and the inner text from the first div class. I tried adding a $a->data1 to the foreach but that didn't work.

How do I grab those data attributes at the same time as the href and innertext?

Also, is there a good way to get both classes with one statement? I assume I could build the find off of the id and grab all the div information.


Answer Source

To grab all those attributes, you should first inspect the parsed element, like this:

foreach ($html->find('div[class=bar] a') as $a) {
  var_dump($a->attr); // dump the attributes the parser kept for this <a>
}

...and see if those attributes exist. They don't seem to be valid HTML, so maybe the parser discards them.

If they exist, you can read them like this:

foreach ($html->find('div[class=bar] a') as $a) {
  $article = array($a->href, $a->innertext);
  // Copy each custom attribute only if the parser kept it
  if (isset($a->attr['data1'])) {
    $article['data1'] = $a->attr['data1'];
  }
  if (isset($a->attr['data2'])) {
    $article['data2'] = $a->attr['data2'];
  }
  $articles[] = $article;
}
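
If the attribute names are not known in advance, one option is to copy the whole attr array instead of testing each name. The following is only a rough sketch: node_to_article() is a hypothetical helper, not part of Simple HTML DOM, and the stdClass object is a stand-in that mimics the attr and innertext properties used in this answer so the snippet runs without the library.

```php
<?php
// Hypothetical helper (not part of Simple HTML DOM): merge every attribute
// of a node with its inner text, so no attribute name is hard-coded.
function node_to_article($node)
{
    // $node->attr is the name => value array read in the answer above.
    $article = is_array($node->attr) ? $node->attr : array();
    $article['innertext'] = $node->innertext;
    return $article;
}

// Stand-in for a parsed <a> element (assumption: it only mimics the two
// properties read by the helper). With the real parser you would pass
// each $a from $html->find('div[class=bar] a') instead.
$fake = new stdClass();
$fake->attr = array('data1' => 'xxxx', 'data2' => 'xxxx', 'href' => '');
$fake->innertext = 'Inner text';

print_r(node_to_article($fake));
```

Inside the real loop this would presumably become $articles[] = node_to_article($a); and it also covers data3 and data4 on the second div without extra isset() checks.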

To match both classes, you can pass multiple selectors separated by a comma:

foreach ($html->find('div[class=bar] a, div[class=bar2] a') as $a) {
  // same loop body as above
}