Punxor Punxor - 6 months ago 29
HTML Question

Simple HTML Dom not working as expected

Hey, I am trying to parse this HTML, but it is not working as expected. I managed to get the title and the link, but I also need the size, and for some reason my code is not selecting it. The method I tried is shown further down.

Please note I'm new to Simple HTML DOM and I'm just guessing how to select these.

This is what it outputs now:

Title > Pokemon S19E14 DUBBED 720p HDTV x264-W4F [eztv]

Link > /ep/148425/pokemon-s19e14-dubbed-720p-hdtv-x264-w4f/

Size > Pokemon S19E14 DUBBED 720p HDTV x264-W4F [eztv]

This is what I need:

Title > Pokemon S19E14 DUBBED 720p HDTV x264-W4F [eztv]

Link > /ep/148425/pokemon-s19e14-dubbed-720p-hdtv-x264-w4f/

Size > 523.10 MB

$html = str_get_html("<td align=\"center\" class=\"forum_thread_post\">
<a href=\"pokemonlink\" rel=\"nofollow\" class=\"download_1\" title=\"Pokemon S19E14 DUBBED HDTV x264-W4F Torrent: Download Mirror #1\"></a>
</td>
<td align=\"center\" class=\"forum_thread_post\">175.32 MB</td>
<td align=\"center\" class=\"forum_thread_post\">4h 23m</td>
<td align=\"center\" class=\"forum_thread_post_end\"><a href=\"/forum/discuss/148426/\" rel=\"nofollow\" title=\"Discuss about Pokemon S19E14 DUBBED HDTV x264-W4F [eztv]:\"><img src=\"/ezimg/s/1/3/chat_empty.png\" border=\"0\" width=\"16\" height=\"16\" alt=\"Discuss\" title=\"Discuss about this show\"/></a></td>
</tr>
<tr name=\"hover\" class=\"forum_header_border\">
<td width=\"35\" class=\"forum_thread_post\"><a href=\"/shows/1253/pokemon/\" title=\"Pokémon Torrent\"><img src=\"/ezimg/s/1/3/show_info.png\" border=\"0\" alt=\"Show\" title=\"Show Description about Pokémon\"></a><a href=\"http://www.tvmaze.com/episodes/762956/pokemon-the-series-xy-19x14-an-explosive-operation\" target=\"_blank\" title=\"More info about Pokemon S19E14 DUBBED 720p HDTV x264-W4F [eztv] at tvmaze.com\" onclick=\"trackOutboundLink('http://www.tvmaze.com/episodes/762956/pokemon-the-series-xy-19x14-an-explosive-operation'); return false;\"><img src=\"/images/tvmaze-16x16.png\" width=\"16\" height=\"16\" border=\"0\" alt=\"TVmaze\"/></a></td>
<td class=\"forum_thread_post\">
<a href=\"/ep/148425/pokemon-s19e14-dubbed-720p-hdtv-x264-w4f/\" title=\"Pokemon S19E14 DUBBED 720p HDTV x264-W4F [eztv] (523.10 MB)\" alt=\"Pokemon S19E14 DUBBED 720p HDTV x264-W4F [eztv] (523.10 MB)\" class=\"epinfo\">Pokemon S19E14 DUBBED 720p HDTV x264-W4F [eztv]</a>
</td>
<td align=\"center\" class=\"forum_thread_post\">
<a href=\"pokemonlink\" rel=\"nofollow\" class=\"download_1\" title=\"Pokemon S19E14 DUBBED 720p HDTV x264-W4F Torrent: Download Mirror #1\"></a>
</td>
<td align=\"center\" class=\"forum_thread_post\">523.10 MB</td>
<td align=\"center\" class=\"forum_thread_post\">4h 23m</td>
<td align=\"center\" class=\"forum_thread_post_end\"><a href=\"/forum/discuss/148425/\" rel=\"nofollow\" title=\"Discuss about Pokemon S19E14 DUBBED 720p HDTV x264-W4F [eztv]:\"><img src=\"/ezimg/s/1/3/chat_empty.png\" border=\"0\" width=\"16\" height=\"16\" alt=\"Discuss\" title=\"Discuss about this show\"/></a></td>
</tr>");

$tdTorrents = $html->find("td[class=forum_thread_post]");
foreach ($tdTorrents as $torrent) {
    if(($torrent->find("a[class=epinfo]", 0))) {
        $title = $torrent->find("a[class=epinfo]", 0)->innertext;
        $link = $torrent->find("a[class=epinfo]", 0)->href;
        if (strpos($torrent->innertext, "MB") !== false || strpos($torrent->innertext, "KB") !== false || strpos($torrent->innertext, "GB")!== false) {
            $Size = $torrent->innertext;
            print "<br>Title > ".$title."</br>
<br>Link > ".$link."</br>
<br>Size > ".$Size."</br>";
        }
    }
}


Any help is appreciated, thanks. The HTML I am using is below.

<td align="center" class="forum_thread_post">
<a href="pokemonlink" rel="nofollow" class="download_1" title="Pokemon S19E14 DUBBED HDTV x264-W4F Torrent: Download Mirror #1"></a>
</td>
<td align="center" class="forum_thread_post">175.32 MB</td>
<td align="center" class="forum_thread_post">4h 23m</td>
<td align="center" class="forum_thread_post_end"><a href="/forum/discuss/148426/" rel="nofollow" title="Discuss about Pokemon S19E14 DUBBED HDTV x264-W4F [eztv]:"><img src="/ezimg/s/1/3/chat_empty.png" border="0" width="16" height="16" alt="Discuss" title="Discuss about this show"/></a></td>
</tr>
<tr name="hover" class="forum_header_border">
<td width="35" class="forum_thread_post"><a href="/shows/1253/pokemon/" title="Pokémon Torrent"><img src="/ezimg/s/1/3/show_info.png" border="0" alt="Show" title="Show Description about Pokémon"></a><a href="http://www.tvmaze.com/episodes/762956/pokemon-the-series-xy-19x14-an-explosive-operation" target="_blank" title="More info about Pokemon S19E14 DUBBED 720p HDTV x264-W4F [eztv] at tvmaze.com" onclick="trackOutboundLink('http://www.tvmaze.com/episodes/762956/pokemon-the-series-xy-19x14-an-explosive-operation'); return false;"><img src="/images/tvmaze-16x16.png" width="16" height="16" border="0" alt="TVmaze"/></a></td>
<td class="forum_thread_post">
<a href="/ep/148425/pokemon-s19e14-dubbed-720p-hdtv-x264-w4f/" title="Pokemon S19E14 DUBBED 720p HDTV x264-W4F [eztv] (523.10 MB)" alt="Pokemon S19E14 DUBBED 720p HDTV x264-W4F [eztv] (523.10 MB)" class="epinfo">Pokemon S19E14 DUBBED 720p HDTV x264-W4F [eztv]</a>
</td>
<td align="center" class="forum_thread_post">
<a href="pokemonlink" rel="nofollow" class="download_1" title="Pokemon S19E14 DUBBED 720p HDTV x264-W4F Torrent: Download Mirror #1"></a>
</td>
<td align="center" class="forum_thread_post">523.10 MB</td>
<td align="center" class="forum_thread_post">4h 23m</td>
<td align="center" class="forum_thread_post_end"><a href="/forum/discuss/148425/" rel="nofollow" title="Discuss about Pokemon S19E14 DUBBED 720p HDTV x264-W4F [eztv]:"><img src="/ezimg/s/1/3/chat_empty.png" border="0" width="16" height="16" alt="Discuss" title="Discuss about this show"/></a></td>
</tr>

Answer

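Try next_sibling():

$e->next_sibling() returns the next sibling of the element, or null if not found.

In your code, $torrent is the td that contains a.epinfo, so $torrent->innertext is the link itself. The "MB" check only passes because the link's title attribute contains "(523.10 MB)", which is why Size shows the title again.

Because the td with the size is the next sibling of the next sibling of the td with a.epinfo, you can try: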

if (($epinfo = $torrent->find("a[class=epinfo]", 0))) {
    $title = $epinfo->innertext;
    $link = $epinfo->href;

    $Size = '';
    // The size cell is the next sibling of the next sibling of this td.
    // Check each step, because next_sibling() returns null when there is no sibling.
    $download_el = $torrent->next_sibling();
    $size_el = $download_el ? $download_el->next_sibling() : null;
    if ($size_el) {
        $Size = trim($size_el->innertext);
    }

    // Only print rows whose size cell actually contains a unit.
    if (strpos($Size, "MB") !== false || strpos($Size, "KB") !== false || strpos($Size, "GB") !== false) {
        print "<br>Title > ".$title."</br>
            <br>Link > ".$link."</br>
            <br>Size > ".$Size."</br>";
    }
}
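
If the number of cells between the link and the size might change, a slightly more defensive variant is to start from each a.epinfo link, go up to its td, and walk the following siblings until one looks like a size. This is only a sketch under some assumptions: simple_html_dom.php is assumed to sit next to the script, extract_torrents() is a hypothetical helper name, and the shortened HTML is made-up sample data modelled on the question's markup.

```php
<?php
// Assumption: the Simple HTML DOM library file is in the same directory.
require_once 'simple_html_dom.php';

/**
 * Hypothetical helper: returns title, link and size for every a.epinfo link.
 */
function extract_torrents($html)
{
    $results = [];
    foreach ($html->find('a[class=epinfo]') as $a) {
        $td = $a->parent(); // the <td> that holds the link
        $size = '';
        // Walk the following sibling cells until one looks like "123.45 MB".
        for ($cell = $td->next_sibling(); $cell; $cell = $cell->next_sibling()) {
            $text = trim($cell->plaintext);
            if (preg_match('/^[\d.]+\s*(KB|MB|GB)$/', $text)) {
                $size = $text;
                break;
            }
        }
        $results[] = [
            'title' => trim($a->plaintext),
            'link'  => $a->href,
            'size'  => $size,
        ];
    }
    return $results;
}

// Made-up sample row with the same cell order as in the question.
$html = str_get_html('<table><tr>
<td class="forum_thread_post"><a href="/ep/148425/" class="epinfo" title="Example (523.10 MB)">Example title</a></td>
<td class="forum_thread_post"><a href="pokemonlink" class="download_1"></a></td>
<td class="forum_thread_post">523.10 MB</td>
<td class="forum_thread_post">4h 23m</td>
</tr></table>');

$rows = extract_torrents($html);
foreach ($rows as $row) {
    echo "Title > {$row['title']}\nLink > {$row['link']}\nSize > {$row['size']}\n";
}
```

Matching on the cell's text rather than on its position means the title attribute of the link can no longer satisfy the size check by accident, and a row without a size cell simply ends up with an empty size.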