Flame_Phoenix - 4 months ago
HTML Question

NodeJS HttpGet method not working on Wiki

Objective



Download the HTML of a Wiki Page.

Background



I am trying to download the HTML of a Wiki page (http://warframe.wikia.com/wiki/Mods_2.0) so that I can parse it for information. To do this I am using Node.js and the request methods of its built-in http module.

Code



I have a very simple code file which merely accesses the website and tries to print its contents:

"use strict";

var http = require("http");

var options = {
  host: "http://warframe.wikia.com",
  port: 80,
  path: 'wiki/Mods_2.0',
  method: "GET"
};

var req = http.request(options, function(res) {

  console.log("STATUS: " + res.statusCode);
  console.log("HEADERS: " + JSON.stringify(res.headers));
  res.setEncoding('utf8');

  res.on("data", function (chunk) {
    console.log("BODY: " + chunk);
  });
});

req.end();


Problem



The problem is that no matter what I try, I always get the following error output:

Debugger listening on port 15454
events.js:141
throw er; // Unhandled 'error' event
^

Error: getaddrinfo ENOTFOUND http://warframe.wikia.com http://warframe.wikia.com:80
at errnoException (dns.js:27:10)
at GetAddrInfoReqWrap.onlookup [as oncomplete] (dns.js:78:26)


Process exited with code: 1


I am fairly sure that I am building the URL incorrectly, but somehow I can't understand how to fix this!

What I tried



My approach is based on the contents of this discussion: In Node.js / Express, how do I "download" a page and gets its HTML?

I tried several combinations of the URL path in the options variable, only to get different versions of the same error.

I also read In Node.js / Express, how do I "download" a page and gets its HTML?, but that discussion deals with a different problem (it focuses on streaming, which is not my objective).

Questions



1 - I am fairly sure this is a simple error but I cannot see it. What am I missing?

Answer

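The host option must be a bare hostname. Node passes it straight to the DNS lookup, so the http:// prefix is what triggers getaddrinfo ENOTFOUND. Remove the http:// from host and add a leading / to path: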

"use strict";

var http = require("http");

var options = {
  host: "warframe.wikia.com",
  port: 80,
  path: '/wiki/Mods_2.0',
  method: "GET"
};

var req = http.request(options, function(res) {

  console.log("STATUS: " + res.statusCode);
  console.log("HEADERS: " + JSON.stringify(res.headers));
  //res.setEncoding('utf8');

  res.on("data", function (chunk) {
    console.log("BODY: " + chunk);
  });
});

req.end();
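
To avoid splitting the URL by hand, the options can also be derived from the full URL. The sketch below is not part of the original answer: toRequestOptions and fetchPage are hypothetical helper names, and it assumes a Node version where the URL class is available as a global (much newer than the one that produced the stack trace in the question). It also attaches an 'error' listener, so a failed lookup is reported through the callback rather than thrown as an unhandled 'error' event.

```javascript
"use strict";

var http = require("http");

// Hypothetical helper (not from the original answer): derive the options
// object from a full URL so host and path cannot be mixed up by hand.
// Assumes a plain http:// URL and a Node version with a global URL class.
function toRequestOptions(pageUrl) {
  var parsed = new URL(pageUrl);
  return {
    host: parsed.hostname,                 // bare hostname, no "http://"
    port: Number(parsed.port) || 80,       // URL.port is "" when the default port is used
    path: parsed.pathname + parsed.search, // always begins with "/"
    method: "GET"
  };
}

// Hypothetical wrapper: collects the whole body and reports failures through
// the callback instead of leaving the request's 'error' event unhandled.
function fetchPage(pageUrl, callback) {
  var req = http.request(toRequestOptions(pageUrl), function (res) {
    var body = "";
    res.setEncoding("utf8");
    res.on("data", function (chunk) { body += chunk; });
    res.on("end", function () { callback(null, res.statusCode, body); });
  });
  req.on("error", callback); // e.g. getaddrinfo ENOTFOUND arrives here as err
  req.end();
}
```

Calling fetchPage("http://warframe.wikia.com/wiki/Mods_2.0", function (err, status, html) { ... }) would then hand the page to the callback in one piece. Note that http.request does not follow redirects on its own, so a 3xx status means the body is not the page itself.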