Mirabilis Mirabilis - 1 year ago 102
HTTP Question

Node.js request module getting modern version of website

Often when making a GET request with the request module in Node.js, an old version of the website's HTML is returned.

For example, a very old version of Google is returned when making a request to http://google.com. On the other hand, accessing Google in a browser returns a much more modern version of the website.

I suspect that this is related to the device/browser information that sites like Google read from the request. As far as I know, request doesn't send any device information.

Is there any way to trick sites into thinking that they are being accessed by an actual device/browser (and a modern one too)?

Answer Source

By default, the request package does not include any device information (as the question mentions). Big sites like Google use this information, which browsers send in the User-Agent header, to tailor aspects of the page such as the HTML version and the CSS/JS features used. A newer user agent means the page can use more and newer features. To emulate a specific device (to debug a mobile page, for instance), pick the appropriate user-agent string at useragentstring.com.

Some other headers, such as Accept and Accept-Encoding, can also affect the response (Doc here).

Try this code (taken from the docs):

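var request = require('request');

var options = {
  url: 'https://google.com',
  headers: {
    // Present the request as a desktop Chrome browser
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36'
  }
};

function callback(error, response, body) {
  // Print the HTML only when the request succeeded
  if (!error && response.statusCode == 200) {
    console.log(body);
  }
}

request(options, callback);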
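
Since Accept and Accept-Encoding can matter as well, here is a minimal sketch of how those headers might be set alongside the user agent. The helper name buildBrowserOptions is hypothetical (it is not part of the request package), and the Accept and Accept-Language values are illustrative assumptions rather than what any particular browser sends. The gzip option is the request package's way of asking for compressed content and decoding it in the response.

```javascript
// Hypothetical helper (not part of the request package): returns an options
// object carrying browser-like headers for the given URL and user-agent string.
function buildBrowserOptions(url, userAgent) {
  return {
    url: url,
    // request option: adds an Accept-Encoding header and decodes the response
    gzip: true,
    headers: {
      'User-Agent': userAgent,
      // Illustrative values, assumed to resemble what a desktop browser sends
      'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
      'Accept-Language': 'en-US,en;q=0.5'
    }
  };
}

var desktopChrome = 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36';
var browserOptions = buildBrowserOptions('https://google.com', desktopChrome);

// Show the headers that would be sent
console.log(browserOptions.headers);
```

The resulting object can be passed to request(browserOptions, callback) in the same way as the options object above. Swapping in a mobile user-agent string should be enough to ask for the mobile version of a page, although how a site reacts to these headers is up to the site.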