l.g.karolos - 17 days ago
PHP Question

I need help extracting football standings from a website using PHP

I need to fetch the HTML table using PHP. How can I do it?

References:

Table to be parsed: http://epsachaias.gr/?page_id=221&cat=4&group=151

I tried this:

<div class="row">
<div class="col-md-8">
<h1>Standings</h1>
<?php
$html = file_get_contents('http://epsachaias.gr/?page_id=221&cat=4&group=206');

$dom = new DOMDocument();
$internalErrors = libxml_use_internal_errors(true);
$dom->loadHTML($html);
libxml_use_internal_errors($internalErrors);

$tables = $dom->getElementsByTagName('table');
$trs = $tables->item(1)->getElementsByTagName('tr');

$output = [];
for ($itr = 0; $itr < $trs->length; $itr++)
{
    $tds = $trs->item($itr)->getElementsByTagName('td');
    if ($tds->length == 22)
    {
        $output[] = new team($tds);
    }
}

var_dump($output);
?>


But I get the following error:


Fatal error: Class 'team' not found.


Then I tried to add this code:

class stats
{
    private $n;
    private $i;
    private $or;
    private $cplus;
    private $cminus;

    public function __construct($n, $i, $or, $cplus, $cminus)
    {
        $this->n = $n;
        $this->i = $i;
        $this->or = $or;
        $this->cplus = $cplus;
        $this->cminus = $cminus;
    }

    function getN()
    {
        return $this->n;
    }

    function getI()
    {
        return $this->i;
    }

    function getOr()
    {
        return $this->or;
    }

    function getCplus()
    {
        return $this->cplus;
    }

    function getCminus()
    {
        return $this->cminus;
    }
}

class team
{
    private $position;
    private $name;
    private $score;
    private $ag;
    private $dk;
    private $together;
    private $within;
    private $out;
    private $penalties;

    public function __construct(DOMNodeList $nodes)
    {
        if ($nodes->length == 22) {
            $this->position = (int) $nodes->item(0)->textContent;
            $this->name = $nodes->item(2)->textContent;
            $this->score = (int) $nodes->item(3)->textContent;
            $this->ag = (int) $nodes->item(4)->textContent;
            $this->dk = (int) $nodes->item(5)->textContent;

            $this->together = new stats((int)$nodes->item(6)->textContent,
                                        (int)$nodes->item(7)->textContent,
                                        (int)$nodes->item(8)->textContent,
                                        (int)$nodes->item(9)->textContent,
                                        (int)$nodes->item(10)->textContent);

            $this->within = new stats(  (int)$nodes->item(11)->textContent,
                                        (int)$nodes->item(12)->textContent,
                                        (int)$nodes->item(13)->textContent,
                                        (int)$nodes->item(14)->textContent,
                                        (int)$nodes->item(15)->textContent);

            $this->out = new stats(     (int)$nodes->item(16)->textContent,
                                        (int)$nodes->item(17)->textContent,
                                        (int)$nodes->item(18)->textContent,
                                        (int)$nodes->item(19)->textContent,
                                        (int)$nodes->item(20)->textContent);

            $this->penalties = (int) $nodes->item(21)->textContent;
        } else {
            throw new Exception("Incorrect input data");
        }
    }

    public function getPosition()
    {
        return $this->position;
    }

    public function getName()
    {
        return $this->name;
    }

    public function getScore()
    {
        return $this->score;
    }

    public function getAg()
    {
        return $this->ag;
    }

    public function getDk()
    {
        return $this->dk;
    }

    public function getTogether()
    {
        return $this->together;
    }

    public function getWithin()
    {
        return $this->within;
    }

    public function getOut()
    {
        return $this->out;
    }

    public function getPenalties()
    {
        return $this->penalties;
    }
}


But I still get the same error.

Answer

Be aware that this solution depends on the current structure of that site: if the page layout changes, the table index and the expected cell count will need updating.

You can use a DOMDocument object to parse the HTML fetched with file_get_contents, as in this code:

$html = file_get_contents('http://epsachaias.gr/?page_id=221&cat=4&group=151');

$dom = new DOMDocument();
$internalErrors = libxml_use_internal_errors(true);
$dom->loadHTML($html);
libxml_use_internal_errors($internalErrors);

$tables = $dom->getElementsByTagName('table');
$trs = $tables->item(1)->getElementsByTagName('tr');

$output = [];
for ($itr = 0; $itr < $trs->length; $itr++) {
    $tds = $trs->item($itr)->getElementsByTagName('td');

    if ($tds->length == 22) {
        $row = [];
        for ($itd = 0; $itd < $tds->length; $itd++) {
            $row[] = $tds->item($itd)->textContent;
        }
        $output[] = $row;
    }
}

var_dump($output);

$output is your final array: one entry per team row, each holding the text of that row's 22 cells.
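
Since the approach depends on the page layout, it may help to fail loudly when the expected table is missing instead of calling a method on null. The following is only a minimal sketch, not part of the original solution: the helper name extractRows and the inline sample markup are assumptions made for illustration. With the real page you would pass the string returned by file_get_contents (after checking it is not false), table index 1 and 22 cells.

```php
<?php
// Hypothetical helper: returns the cell texts of every row in the chosen
// table that has exactly $expectedCells <td> cells.
function extractRows(string $html, int $tableIndex, int $expectedCells): array
{
    $dom = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // collect parser warnings instead of printing them
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $table = $dom->getElementsByTagName('table')->item($tableIndex);
    if ($table === null) {
        throw new RuntimeException("Table #{$tableIndex} not found; the page layout may have changed");
    }

    $rows = [];
    foreach ($table->getElementsByTagName('tr') as $tr) {
        $tds = $tr->getElementsByTagName('td');
        if ($tds->length !== $expectedCells) {
            continue; // skip header rows and rows with an unexpected shape
        }
        $row = [];
        foreach ($tds as $td) {
            $row[] = trim($td->textContent);
        }
        $rows[] = $row;
    }
    return $rows;
}

// Made-up sample markup standing in for the real page.
$sample = '<table><tr><td>ignored</td></tr></table>'
        . '<table><tr><th>Pos</th><th>Team</th></tr>'
        . '<tr><td>1</td><td>Alpha</td></tr>'
        . '<tr><td>2</td><td>Beta</td></tr></table>';

$rows = extractRows($sample, 1, 2);
print_r($rows);
```

The header row is skipped because it contains th cells only, which is the same effect the length check has in the code above.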

If you want more convenient access to the parsed data, you can use classes and objects. For example, you can define one class for a group of statistics and another for a team, as in this code:

class stats
{
    private $n;
    private $i;
    private $or;
    private $cplus;
    private $cminus;

    public function __construct($n, $i, $or, $cplus, $cminus)
    {
        $this->n = $n;
        $this->i = $i;
        $this->or = $or;
        $this->cplus = $cplus;
        $this->cminus = $cminus;
    }

    function getN()
    {
        return $this->n;
    }

    function getI()
    {
        return $this->i;
    }

    function getOr()
    {
        return $this->or;
    }

    function getCplus()
    {
        return $this->cplus;
    }

    function getCminus()
    {
        return $this->cminus;
    }
}

class team
{
    private $position;
    private $name;
    private $score;
    private $ag;
    private $dk;
    private $together;
    private $within;
    private $out;
    private $penalties;

    public function __construct(DOMNodeList $nodes)
    {
        if ($nodes->length == 22) {
            $this->position = (int) $nodes->item(0)->textContent;
            $this->name = $nodes->item(2)->textContent;
            $this->score = (int) $nodes->item(3)->textContent;
            $this->ag = (int) $nodes->item(4)->textContent;
            $this->dk = (int) $nodes->item(5)->textContent;

            $this->together = new stats((int)$nodes->item(6)->textContent, 
                                        (int)$nodes->item(7)->textContent, 
                                        (int)$nodes->item(8)->textContent, 
                                        (int)$nodes->item(9)->textContent, 
                                        (int)$nodes->item(10)->textContent);

            $this->within = new stats(  (int)$nodes->item(11)->textContent,
                                        (int)$nodes->item(12)->textContent, 
                                        (int)$nodes->item(13)->textContent, 
                                        (int)$nodes->item(14)->textContent, 
                                        (int)$nodes->item(15)->textContent);

            $this->out = new stats(     (int)$nodes->item(16)->textContent,
                                        (int)$nodes->item(17)->textContent, 
                                        (int)$nodes->item(18)->textContent, 
                                        (int)$nodes->item(19)->textContent, 
                                        (int)$nodes->item(20)->textContent);

            $this->penalties = (int) $nodes->item(21)->textContent;
        } else {
            throw new Exception("Incorrect input data");
        }
    }

    public function getPosition()
    {
        return $this->position;
    }

    public function getName()
    {
        return $this->name;
    }

    public function getScore()
    {
        return $this->score;
    }

    public function getAg()
    {
        return $this->ag;
    }

    public function getDk()
    {
        return $this->dk;
    }

    public function getTogether()
    {
        return $this->together;
    }

    public function getWithin()
    {
        return $this->within;
    }

    public function getOut()
    {
        return $this->out;
    }

    public function getPenalties()
    {
        return $this->penalties;
    }
}

Because I assume you only want to read the parsed data, I provided only getters for the object properties. If you want to change them, you will of course have to add setters or make the properties public (and delete the methods you no longer need).

To prepare a collection of parsed data, you can use this code:
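
As a rough illustration of the setter option, here is a minimal sketch. The class editable_team is a hypothetical, trimmed-down stand-in for the team class, and the non-empty-name rule is just an example check, not something the site requires.

```php
<?php
// Hypothetical, trimmed-down variant of the team class with one writable property.
class editable_team
{
    private $name;

    public function __construct(string $name)
    {
        $this->setName($name);
    }

    public function getName()
    {
        return $this->name;
    }

    // Setter: the single place where the name can change, so input can be checked.
    public function setName(string $name)
    {
        $name = trim($name);
        if ($name === '') {
            throw new InvalidArgumentException('Team name must not be empty');
        }
        $this->name = $name;
    }
}

$team = new editable_team('Alpha');
$team->setName('  Alpha United ');
echo $team->getName() . PHP_EOL;
```

Compared with a public property, a setter keeps the property private and gives you one place to validate or normalise the new value.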

$html = file_get_contents('http://epsachaias.gr/?page_id=221&cat=4&group=151');

$dom = new DOMDocument();
$internalErrors = libxml_use_internal_errors(true);
$dom->loadHTML($html);
libxml_use_internal_errors($internalErrors);

$tables = $dom->getElementsByTagName('table');
$trs = $tables->item(1)->getElementsByTagName('tr');

$output = [];
for ($itr = 0; $itr < $trs->length; $itr++) {
    $tds = $trs->item($itr)->getElementsByTagName('td');

    if ($tds->length == 22) {
        $output[] = new team($tds);
    }
}

var_dump($output);
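
Note that new team($tds) only works if the stats and team classes have been loaded in the same request, either defined inside PHP tags in the same file or pulled in from another file. If they are not, PHP reports Class 'team' not found. One possible cause of that error, assuming the classes were saved in a separate file that is never included, can be sketched as follows. The file name and the stand-in class demo_loaded_team are hypothetical and exist only to make the example runnable on its own.

```php
<?php
// Hypothetical file name; in a real project this file would contain the stats and team classes.
$classFile = sys_get_temp_dir() . '/demo_standings_classes.php';

// Write a tiny stand-in class so the sketch is runnable on its own.
file_put_contents(
    $classFile,
    '<?php class demo_loaded_team { public function getName() { return "Alpha"; } }'
);

// Before the file is loaded PHP does not know the class; `new` would fail at this point.
$knownBefore = class_exists('demo_loaded_team', false);

require_once $classFile; // load the definitions before the first `new`

$team = new demo_loaded_team();
echo ($knownBefore ? 'known' : 'unknown') . ' before require, ';
echo 'after require: ' . $team->getName() . PHP_EOL;
```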

Previously $output was a simple array of values; now it is a collection of objects. For example, to get the ΟΜΑΔΑ (team) of the second team together with the value in its ΣΥΝΟΛΟ (total) -> Ν cell, you can simply use this code:

echo "{$output[1]->getName()}: {$output[1]->getTogether()->getN()}";

and the output will be

Άνω Καστρίτσι: 21
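
To show the whole collection on a page rather than a single value, you could loop over the objects and build an HTML table. This is a minimal sketch under stated assumptions: demo_team is a reduced stand-in for the team class with only three getters, renderStandings is a hypothetical helper name, the column headings are my own, and the data is made up.

```php
<?php
// Stand-in for the team class, reduced to three getters (assumption for brevity).
class demo_team
{
    private $position;
    private $name;
    private $score;

    public function __construct(int $position, string $name, int $score)
    {
        $this->position = $position;
        $this->name = $name;
        $this->score = $score;
    }

    public function getPosition() { return $this->position; }
    public function getName() { return $this->name; }
    public function getScore() { return $this->score; }
}

// Hypothetical helper that turns the collection into an HTML table.
function renderStandings(array $teams): string
{
    $html = "<table>\n<tr><th>#</th><th>Team</th><th>Score</th></tr>\n";
    foreach ($teams as $team) {
        $html .= sprintf(
            "<tr><td>%d</td><td>%s</td><td>%d</td></tr>\n",
            $team->getPosition(),
            htmlspecialchars($team->getName(), ENT_QUOTES, 'UTF-8'), // escape scraped text
            $team->getScore()
        );
    }
    return $html . "</table>\n";
}

// Made-up data; with the real scraper you would pass $output instead.
$teams = [new demo_team(1, 'Alpha & Co', 30), new demo_team(2, 'Beta', 27)];
$table = renderStandings($teams);
echo $table;
```

Escaping the team name matters here because the text comes from a third-party page and is echoed into your own markup.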
