Jeff Engler - 11 months ago
PHP Question

file_get_contents script works with some websites but not others

I'm looking to build a PHP script that parses HTML for particular tags. I've been using this code block, adapted from this tutorial:

$data = file_get_contents('');
$regex = '/<title>(.+?)</';
preg_match($regex, $data, $match); // fills $match; index 1 holds the captured title text
echo $match[1];

The script works with some websites (like google, above), but when I try it with other websites (like, say, freshdirect), I get this error:

"Warning: file_get_contents( [function.file-get-contents]: failed to open stream: HTTP request failed!"

I've seen a bunch of great suggestions on StackOverflow, for example to enable a particular extension in php.ini. But (1) my version of php.ini didn't have that line in it, and (2) when I added it to the extensions section and restarted the WAMP server, per this thread, I still had no success.

Would someone mind pointing me in the right direction? Thank you very much!

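// file_get_html() is provided by the Simple HTML DOM library (simple_html_dom.php)
$html = file_get_html('');
$title = $html->find('title', 0)->innertext; // index 0: find() without an index returns an array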

Or, if you prefer preg_match, you should really be using cURL instead of file_get_contents:

function curl($url){

    // Browser-like request headers; some sites refuse requests that lack them
    $headers = array();
    $headers[]  = "User-Agent:Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv: Gecko/20101203 Firefox/3.6.13";
    $headers[]  = "Accept:text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8";
    $headers[]  = "Accept-Language:en-us,en;q=0.5";
    $headers[]  = "Accept-Encoding:gzip,deflate";
    $headers[]  = "Accept-Charset:ISO-8859-1,utf-8;q=0.7,*;q=0.7";
    $headers[]  = "Keep-Alive:115";
    $headers[]  = "Connection:keep-alive";
    $headers[]  = "Cache-Control:max-age=0";

    $curl = curl_init();
    curl_setopt($curl, CURLOPT_URL, $url);
    curl_setopt($curl, CURLOPT_HTTPHEADER, $headers);
    curl_setopt($curl, CURLOPT_ENCODING, "gzip");    // decode gzip-compressed responses
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);   // return the body instead of printing it
    curl_setopt($curl, CURLOPT_FOLLOWLOCATION, 1);   // follow HTTP redirects
    $data = curl_exec($curl);                        // false on failure
    curl_close($curl);
    return $data;
}

$data = curl('');
$regex = '#<title>(.*?)</title>#mis';
preg_match($regex, $data, $match); // fills $match; index 1 holds the captured title text
echo $match[1];
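
Both snippets above echo $match[1] without checking that the download or the match succeeded. Here is a minimal sketch of a more defensive version. It assumes the failing sites reject requests that carry no browser-like User-Agent, which the thread does not confirm. The function names fetch_with_user_agent and extract_title, the User-Agent value, and the example URL are all hypothetical.

```php
<?php
// Sketch only: the function names, the User-Agent value and the URL in the
// usage comment are illustrative assumptions, not part of the original answer.

// Fetch a page with file_get_contents() while sending a User-Agent header.
// Returns the body as a string, or false if the request failed.
function fetch_with_user_agent($url)
{
    $context = stream_context_create(array(
        'http' => array(
            'user_agent'      => 'Mozilla/5.0 (compatible; ExampleFetcher/1.0)', // assumed value
            'follow_location' => 1,   // follow HTTP redirects
            'timeout'         => 10,  // seconds
        ),
    ));
    return file_get_contents($url, false, $context);
}

// Extract the <title> text from an HTML string.
// Returns null when the input is not usable or no title is found.
function extract_title($html)
{
    if (!is_string($html) || $html === '') {
        return null; // e.g. the fetch returned false
    }
    if (preg_match('#<title[^>]*>(.*?)</title>#is', $html, $match) !== 1) {
        return null; // no <title> element matched
    }
    // Decode entities such as &amp; and strip surrounding whitespace
    return trim(html_entity_decode($match[1], ENT_QUOTES | ENT_HTML5, 'UTF-8'));
}

// Usage (hypothetical URL):
//   $title = extract_title(fetch_with_user_agent('https://example.com/'));
//   echo $title === null ? 'No title found' : $title;
```

extract_title() accepts the result of either file_get_contents() or the curl() function above, since both return false on failure.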