david camry - 3 months ago
PHP Question

How to get the exact img src with XPath

If you inspect the page to get the img src, you'll see something like this:

/images/March/img1.jpeg

But that is only a relative path, not a complete address. I want to scrape this page and get the full src value. How can I do that?
Thanks in advance.

<?php
$content = file_get_contents('http://www.example.com/');
$dom = new DOMDocument();
$dom->loadHTML($content);
$xpath = new DOMXPath($dom);
$img = $xpath->query("(//img)[2]/@src");
foreach ($img as $val) {
    $images = $val->nodeValue; // just returns img/march/img1.jpeg
    // instead of www.example.com/img.....
}
?>

Answer

The src attribute holds whatever the page author wrote, so you have to build the absolute URL yourself, like this:

<?php

// Include the scheme; a bare 'example.com' would be read as a local file name
$content = file_get_contents('http://www.example.com/');

$dom     = new DOMDocument();
$dom->loadHTML($content);

$xpath = new DOMXPath($dom);
$img   = $xpath->query("(//img)[2]/@src");

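// Make an absolute URL from a possibly relative one.
// Note: the path is joined onto $baseurl as given, so this suits
// root-relative paths such as /images/March/img1.jpeg.
function getAbsUrl($value, $baseurl)
{
    $parsed = parse_url($value);

    if (empty($parsed['host'])) {
        // Relative: prepend the base URL
        $path = isset($parsed['path']) ? $parsed['path'] : '';
        return rtrim($baseurl, '/') . '/' . ltrim($path, '/');
    } else {
        // Already absolute: return unchanged
        return $value;
    }
}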

foreach ($img as $val) {
    $images = getAbsUrl($val->nodeValue, 'http://www.example.com/');
}
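
If the page also uses paths relative to the current document (img/pic.png) or protocol-relative URLs (//cdn.example.com/a.png), joining onto the site root will not always give the right result. Below is a rough sketch of a slightly more general resolver. The function name resolveUrl and the sample URLs are made up for illustration and are not part of the answer above. It assumes you know the URL of the page you scraped, and it does not collapse ./ or ../ segments or honor a <base href> tag.

```php
<?php
// Hypothetical helper: resolve $rel against the URL of the page it came from.
// Simplified on purpose; dot segments (./ and ../) are left untouched.
function resolveUrl(string $rel, string $pageUrl): string
{
    $base = parse_url($pageUrl);
    if ($base === false || empty($base['scheme']) || empty($base['host'])) {
        return $rel; // cannot resolve without an absolute page URL
    }

    // Already absolute (starts with a scheme such as http: or https:)
    if (preg_match('#^[a-z][a-z0-9+.\-]*:#i', $rel)) {
        return $rel;
    }

    $origin = $base['scheme'] . '://' . $base['host']
            . (isset($base['port']) ? ':' . $base['port'] : '');

    // Protocol-relative: reuse the page's scheme
    if (strpos($rel, '//') === 0) {
        return $base['scheme'] . ':' . $rel;
    }

    // Root-relative: attach to scheme + host
    if (strpos($rel, '/') === 0) {
        return $origin . $rel;
    }

    // Document-relative: attach to the directory of the page's path
    $path = $base['path'] ?? '/';
    $dir  = substr($path, 0, strrpos($path, '/') + 1);

    return $origin . $dir . $rel;
}
```

Inside the foreach loop you would then call resolveUrl($val->nodeValue, $pageUrl), where $pageUrl is the same address you passed to file_get_contents.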