Syed Muddasir - 1 year ago
HTML Question

Data Extraction using HTML DOM

I am having trouble with data extraction. I have seen many topics on this issue, but I could not find a solution that meets my requirements, so I would appreciate help with this error.

<?php
require('admin/inc/simple_html_dom.php');

$html = file_get_contents("http://health.hamariweb.com/rawalpindi/doctors");

$title = $html->find("div#infinite-grid-images", 0)->innertext;

echo $title;

?>


I want to show all of these doctors on my website. I am just learning data extraction and have followed many tutorials, but I still get no result. Can anyone please help me? :(

Answer Source

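file_get_contents() returns the page as a plain string, so you cannot call find() on it directly. Try loading that string into a simple_html_dom object first.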

<?php 
    require('admin/inc/simple_html_dom.php');
    $html = file_get_contents("http://health.hamariweb.com/rawalpindi/doctors");
    $dom = new simple_html_dom();
    $dom->load($html);
    $title = $dom->find("#infinite-grid-images", 0)->innertext;

    echo $title;

?>
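
If you want to check the load-then-query idea without the library or a network request, here is a minimal sketch that uses PHP's bundled DOM extension (DOMDocument and DOMXPath) instead of simple_html_dom. The sample markup and the helper name extractInnerHtmlById are made up for illustration; only the element id comes from the question. It also returns null when the element is missing, so you never call a method on a result that is not there.

```php
<?php
// Hypothetical sample markup standing in for the fetched page.
$html = '<html><body><div id="infinite-grid-images">'
      . '<div>Dr. A</div><div>Dr. B</div>'
      . '</div></body></html>';

// Hypothetical helper: returns the inner HTML of the element with the
// given id, or null when no such element exists.
function extractInnerHtmlById(string $html, string $id): ?string
{
    $doc = new DOMDocument();

    // Real-world pages are rarely valid HTML, so collect parser warnings
    // internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Assumes $id is a simple identifier without quote characters.
    $xpath = new DOMXPath($doc);
    $node = $xpath->query('//*[@id="' . $id . '"]')->item(0);
    if ($node === null) {
        return null;
    }

    // Concatenate the serialized children to get the inner HTML.
    $inner = '';
    foreach ($node->childNodes as $child) {
        $inner .= $doc->saveHTML($child);
    }
    return $inner;
}

$result = extractInnerHtmlById($html, 'infinite-grid-images');
echo $result ?? 'Nothing found';
```

The same null check is worth adding around find() in simple_html_dom before reading ->innertext.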

The simple_html_dom.php file also ships with a helper function, file_get_html($url), which fetches the page and returns the parsed DOM object (or false on failure).

You can do something like:

<?php 
    require('admin/inc/simple_html_dom.php');
    $html = file_get_html("http://health.hamariweb.com/rawalpindi/doctors");
    if($html){
        // file_get_html() already returned the parsed DOM in $html.
        $title = $html->find("#infinite-grid-images", 0)->innertext;

        echo $title;
    }else{
        echo "Nothing found";
    }
?>

Good luck!

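Also, cURL is your friend: it lets you send a browser-like user agent, follow redirects, and set a timeout. The example below also collects each doctor's details into an array.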

<?php
    require('simple_html_dom.php');
    $curl = curl_init();
    curl_setopt_array($curl, array(
        CURLOPT_URL => "http://health.hamariweb.com/rawalpindi/doctors",
        CURLOPT_RETURNTRANSFER => 1,
        CURLOPT_FOLLOWLOCATION => 1,
        CURLOPT_ENCODING => "",
        CURLOPT_MAXREDIRS => 10,
        CURLOPT_TIMEOUT => 30,
        CURLOPT_HTTP_VERSION => CURL_HTTP_VERSION_1_1,
        CURLOPT_USERAGENT => 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.133 Safari/537.36',
    ));
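    $file = curl_exec($curl);
    $error = curl_error($curl);
    curl_close($curl);
    if ($file === false) {
        // Stop early if the request failed; $error holds cURL's message.
        die("cURL error: " . $error);
    }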
    $dom = new simple_html_dom();
    $dom->load($file);
    $doctorDivs = $dom->find("#infinite-grid-images", 0)->children();
    $doctors = array();
    foreach($doctorDivs as $div){
        $doctor = array();
        $doctor["image"] = "http://health.hamariweb.com/".$div->find('img', 0)->src;
        $details = $div->find('table', 1)->find("tr");
        $doctor["name"] = trim($details[0]->plaintext);
        $doctor["type"] = trim($details[1]->plaintext);
        $doctor["etc"] = trim($details[2]->plaintext);
        $doctors[] = $doctor;
    }
    // Dump the collected records for inspection.
    echo "<pre>";
    var_dump($doctors);
    echo "</pre>";
?>

You can decide what to do with the data.
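
For example, one option is to render the records as an HTML list on your own page. This is only a sketch: the sample record, its image path, and the function name renderDoctors are made up for illustration, and the array keys are assumed to match the $doctors array built above. Scraped text should be escaped with htmlspecialchars() before you echo it.

```php
<?php
// Hypothetical sample in the same shape as the $doctors array built above.
$doctors = array(
    array(
        "image" => "http://health.hamariweb.com/images/example.jpg",
        "name"  => "Dr. Example <Name>",
        "type"  => "Cardiologist",
        "etc"   => "Rawalpindi",
    ),
);

// Hypothetical helper: builds an HTML list from the scraped records,
// escaping every value so stray markup in the source cannot break the page.
function renderDoctors(array $doctors): string
{
    $out = "<ul>\n";
    foreach ($doctors as $doctor) {
        $out .= sprintf(
            "  <li><img src=\"%s\" alt=\"\"> %s (%s), %s</li>\n",
            htmlspecialchars($doctor["image"], ENT_QUOTES, 'UTF-8'),
            htmlspecialchars($doctor["name"], ENT_QUOTES, 'UTF-8'),
            htmlspecialchars($doctor["type"], ENT_QUOTES, 'UTF-8'),
            htmlspecialchars($doctor["etc"], ENT_QUOTES, 'UTF-8')
        );
    }
    return $out . "</ul>\n";
}

echo renderDoctors($doctors);
```

If you would rather feed the data to JavaScript, json_encode($doctors) on the same array would work as well.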
