Tom - 6 months ago
PHP Question

PHP-Retrieve specific content from multiple pages of a website

What I want to accomplish might be a little hardcore, but I want to know if it's possible:

The question:

My question is the same as PHP-Retrieve content from page, but I want to use it on multiple pages.

The situation:

I'm using a website about TV shows. Every show page shares the same base URL, followed by the name of the show:

http://bierdopje.com/shows/NAME_OF_SHOW

On every show page, there's a line that tells you whether the show is cancelled or still running. I want to retrieve that line to build an overview of the cancelled shows (the website only offers an overview of running shows, so I want to add this as an extra feature).

The real question:

How can I use the DOM to retrieve all the shows (http://bierdopje.com/shows/*) and check the status of each one?

The Note:

I understand that this process may take a while because it is reading the whole website (or is it too much data?).

k3z
Answer

I use phpQuery to fetch data from a web page; it lets you query the DOM with jQuery-style selectors.

For example, to get the list of all shows, you can do this:

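<?php
require_once 'phpQuery/phpQuery/phpQuery.php';

// Load the show index page; the pq() calls below query this document.
$doc = phpQuery::newDocumentHTML(
    file_get_contents('http://www.bierdopje.com/shows')
);

$shows = array();
foreach (pq('.listing a') as $a) {

    $url = pq($a)->attr('href'); // will give "/shows/07-ghost"
    $show = pq($a)->text(); // will give "07 Ghost"
    $shows[$show] = $url; // keep name => relative URL for the next step

}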

Now you can process each show individually: create a new phpQuery::newDocumentHTML for each show page and use a selector to extract the information you need.
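Since the href values in the listing are relative (for example "/shows/07-ghost"), each one has to be turned into a full URL before it can be fetched. Here is a rough sketch of that step. The helper names absoluteShowUrl and fetchShowPages and the half-second pause are my own assumptions, not part of phpQuery or the site, so adjust them to taste:

```php
<?php
// Hypothetical helper: turn a relative href such as "/shows/07-ghost"
// into an absolute URL on the given base host.
function absoluteShowUrl(string $base, string $href): string
{
    return rtrim($base, '/') . '/' . ltrim($href, '/');
}

// Hypothetical helper: fetch every show page in turn, pausing between
// requests so the site is not flooded. Returns an array mapping each
// URL to its HTML, or to false when the request failed.
function fetchShowPages(array $hrefs, string $base, int $delayMicroseconds = 500000): array
{
    $pages = array();
    foreach ($hrefs as $href) {
        $url = absoluteShowUrl($base, $href);
        $pages[$url] = @file_get_contents($url); // false on failure
        usleep($delayMicroseconds);              // assumed polite delay
    }
    return $pages;
}
```

You would call it with the hrefs collected from the listing, e.g. fetchShowPages($hrefs, 'http://www.bierdopje.com'). With many shows this will take a while, so it is probably worth saving the results to disk instead of re-fetching on every run.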


Get the status of a show

$html = file_get_contents('http://www.bierdopje.com/shows/alcatraz');
$doc = phpQuery::newDocumentHTML($html);

$status = pq('.content>span:nth-child(6)')->text();
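If you would rather not depend on phpQuery, the same extraction can probably be done with PHP's built-in DOMDocument and DOMXPath. The sketch below is only that, a sketch: the XPath is my translation of the '.content>span:nth-child(6)' selector above, the function names are made up for illustration, and the word that marks a cancelled show on the site is an assumption you will have to check against a real page.

```php
<?php
// Hypothetical helper: extract the status text from one show page.
// The XPath mirrors the CSS selector '.content>span:nth-child(6)':
// the 6th element child of a .content element, if that child is a span.
function extractShowStatus(string $html): ?string
{
    if (trim($html) === '') {
        return null; // nothing to parse
    }

    $dom = new DOMDocument();
    // Collect parser warnings internally; real-world HTML is rarely clean.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    $query = "//*[contains(concat(' ', normalize-space(@class), ' '), ' content ')]"
           . "/*[6][self::span]";
    $nodes = $xpath->query($query);
    if ($nodes === false || $nodes->length === 0) {
        return null; // selector did not match this page
    }
    return trim($nodes->item(0)->textContent);
}

// Hypothetical helper: keep only the shows whose status contains $needle
// (case-insensitive). $statuses maps show name => status text or null.
function filterByStatus(array $statuses, string $needle): array
{
    return array_filter($statuses, function ($status) use ($needle) {
        return $status !== null && stripos($status, $needle) !== false;
    });
}
```

Run extractShowStatus on the HTML of every show page, store the results as name => status, and then pass that array to filterByStatus with whatever text the site uses for cancelled shows. Pages where the selector does not match come back as null, which makes it easy to spot shows whose layout differs.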