Aydus-Matthew - 3 months ago
Node.js Question

Use node.js simplecrawler with Angular2 TypeScript Project

Is it possible to use simplecrawler for Node.js in an Angular2 TypeScript project? If so, what is the correct setup?


I've installed the module:

npm install simplecrawler --save

I declared the Crawler class in my TypeScript service:

declare var Crawler: any;

But creating the Crawler object fails:

var crawler = new Crawler('http://www.google.com');

With browser console error:

crawler.js:10 Uncaught ReferenceError: require is not defined

I notice that the following file uses require: node_modules\simplecrawler\lib

var FetchQueue = require("./queue.js"),
    CookieJar = require("./cookies.js"),
    MetaInfo = require("../package.json");

var http = require("http"),
    https = require("https"),
    EventEmitter = require("events").EventEmitter,
    uri = require("urijs"),
    zlib = require("zlib"),
    util = require("util"),
    iconv = require("iconv-lite");


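I am not really sure why you are trying to instantiate your crawler in your Angular2 application, unless you are running Angular2 on the server side.

simplecrawler is meant to be used server-side, not client-side: it relies on Node.js's require and on core modules such as http, https and zlib, which a browser does not provide.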
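
One way to make that constraint explicit is a runtime guard that fails with a clear message instead of a bare ReferenceError. This is only a minimal sketch: isNodeRuntime and assertServerSide are hypothetical helpers, not part of simplecrawler or Angular2, and the check assumes that Node.js exposes process.versions.node while a browser does not.

```typescript
// Hypothetical helper: true when the code is running under Node.js.
function isNodeRuntime(): boolean {
  const g = globalThis as any;
  return typeof g.process === "object"
    && g.process !== null
    && typeof g.process.versions === "object"
    && typeof g.process.versions.node === "string";
}

// Hypothetical helper: throws a descriptive error outside Node.js.
function assertServerSide(feature: string): void {
  if (!isNodeRuntime()) {
    throw new Error(feature + " needs Node.js and cannot run in the browser");
  }
}

// Call this before touching any server-only module.
assertServerSide("simplecrawler");
```

Calling the guard at the top of a server-only service would turn an accidental browser import into a readable error.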

That being said, I think that you are not instantiating your crawler correctly.

Instead of this:

var crawler = new Crawler('http://www.google.com');

You should have something like this:

var Crawler = require("simplecrawler");
var crawler = new Crawler('http://www.google.com');

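The underlying issue is that you are calling require in your front-end Angular2 application, where it is not defined. Again, I really don't think you want to do this unless you are building a universal (server-rendered) Angular2 app. But if you insist, you will need to use a TypeScript import statement for simplecrawler instead of a bare require call. Because the module exports the Crawler constructor itself (see the require example above), TypeScript's import-equals form is the closest match, with the compiler set to emit CommonJS modules:

import Crawler = require("simplecrawler");
var crawler = new Crawler('http://www.google.com');

Something like that should work wherever the code runs under Node.js. In the browser the import alone will not help, because simplecrawler's own dependencies (http, https, zlib) are still missing.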
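
For the server-side route, here is a minimal sketch in TypeScript of collecting the crawled URLs. It assumes simplecrawler's event-emitter style API (the fetchcomplete and complete events, a url property on the queue item, and start()). CrawlerLike, QueueItemLike, collectUrls and the in-memory FakeCrawler are hypothetical names introduced here so that the example runs without the package or network access.

```typescript
// Hypothetical minimal shape of the crawler this sketch relies on.
interface QueueItemLike {
  url: string;
}

interface CrawlerLike {
  on(event: string, listener: (...args: any[]) => void): unknown;
  start(): unknown;
}

// Collects every fetched URL and hands the list to `done` when the crawl ends.
function collectUrls(crawler: CrawlerLike, done: (urls: string[]) => void): void {
  const urls: string[] = [];
  crawler.on("fetchcomplete", (item: QueueItemLike) => {
    urls.push(item.url);
  });
  crawler.on("complete", () => done(urls));
  crawler.start();
}

// In-memory stand-in (an assumption, not part of simplecrawler).
// It emits its events synchronously; a real crawl finishes later.
class FakeCrawler implements CrawlerLike {
  private listeners: { [event: string]: Array<(...args: any[]) => void> } = {};

  constructor(private pages: string[]) {}

  on(event: string, listener: (...args: any[]) => void): this {
    (this.listeners[event] = this.listeners[event] || []).push(listener);
    return this;
  }

  start(): this {
    for (const url of this.pages) {
      this.emit("fetchcomplete", { url });
    }
    this.emit("complete");
    return this;
  }

  private emit(event: string, ...args: any[]): void {
    for (const listener of this.listeners[event] || []) {
      listener(...args);
    }
  }
}

let crawled: string[] = [];
collectUrls(
  new FakeCrawler(["http://example.com/", "http://example.com/about"]),
  (urls) => { crawled = urls; }
);
console.log(crawled);
```

On a Node.js server you would pass a real crawler created with new Crawler('http://www.google.com') in place of FakeCrawler, and the Angular2 application would request the collected results from that server over HTTP rather than loading simplecrawler itself.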