sol acyon - 4 months ago
PHP Question

PHP doesn't let me output the HTML of certain sites, why?

I'm trying to build a basic web scraper. It works fine for almost every website, but some sites I'm unable to scrape. Why is this? Here is my code for a site that works (this site):



<!doctype html>
<html lang="en-US">
<body>
<?php
$url ='http://stackoverflow.com/';
$output = file_get_contents($url);
echo $output;
?>
</body>
</html>





When run on my localhost, this outputs the content of stackoverflow.com into my page. Here is a site it doesn't work for:



<!doctype html>
<html lang="en-US">
<body>
<?php
$url ='https://www.galottery.com/en-us/home.html';
$output = file_get_contents($url);
echo $output;
?>
</body>
</html>





Instead of loading the site I get this error:


Warning: file_get_contents(https://www.galottery.com/en-us/home.html): failed to open stream: HTTP request failed! HTTP/1.1 403 Forbidden in C:\xampp\htdocs\projects\QD\webScraping\index.php on line 6


Why does this work for some sites and not for others? I thought it might be because this one is an HTTPS site, but I've tried the same code on others like https://google.com and it works just fine.

I'm using XAMPP to run PHP locally.

Answer

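The 403 most likely means the site rejects requests that don't look like they come from a browser; by default file_get_contents sends no User-Agent header. Passing browser-like headers through a stream context works: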

<?php

// Request options for the http wrapper (also used for https URLs).
// The headers mimic a browser request, including a User-Agent.
$ops = array(
    'http' => array(
        'method' => "GET",
        'header' => "Accept-Language: en\r\n" .
                    "Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8\r\n" .
                    "Cookie: foo=bar\r\n" .
                    "User-Agent: Mozilla/5.0 (iPad; U; CPU OS 3_2 like Mac OS X; en-us) AppleWebKit/531.21.10 (KHTML, like Gecko) Version/4.0.4 Mobile/7B334b Safari/531.21.10\r\n"
    )
);

$context = stream_context_create($ops);

echo file_get_contents('https://www.galottery.com/en-us/home.html', false, $context);
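
If you want to see which status a site returned instead of only getting a warning, one option is to set the `ignore_errors` context option and read the status line that the http wrapper leaves in `$http_response_header`. The following is a minimal sketch under that assumption; `build_header_string`, `parse_status_code` and `fetch_page` are hypothetical helper names that are not part of the answer above.

```php
<?php

// Hypothetical helper: joins name => value pairs into the
// "\r\n"-separated header string the http stream wrapper expects.
function build_header_string(array $headers): string
{
    $lines = [];
    foreach ($headers as $name => $value) {
        $lines[] = $name . ': ' . $value;
    }
    return implode("\r\n", $lines) . "\r\n";
}

// Hypothetical helper: extracts the numeric code from a status line such as
// "HTTP/1.1 403 Forbidden". Returns null for any line that is not a status line.
function parse_status_code(string $line): ?int
{
    if (preg_match('#^HTTP/\d+(?:\.\d+)?\s+(\d{3})#', $line, $m)) {
        return (int) $m[1];
    }
    return null;
}

// Hypothetical helper: fetches $url with the given headers and returns
// the last HTTP status seen together with the body.
function fetch_page(string $url, array $headers): array
{
    $context = stream_context_create([
        'http' => [
            'method'        => 'GET',
            'header'        => build_header_string($headers),
            // Return the body even on 4xx/5xx responses instead of failing.
            'ignore_errors' => true,
        ],
    ]);

    $body = file_get_contents($url, false, $context);

    // The http wrapper fills $http_response_header in the local scope.
    // After redirects it holds several status lines, so keep the last one.
    $status = null;
    foreach ($http_response_header ?? [] as $line) {
        $code = parse_status_code($line);
        if ($code !== null) {
            $status = $code;
        }
    }

    return ['status' => $status, 'body' => $body === false ? null : $body];
}
```

Calling `fetch_page('https://www.galottery.com/en-us/home.html', ['User-Agent' => '...'])` with and without the User-Agent entry, and comparing the returned `status`, is one way to check whether that header is what the site is filtering on.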