Lucas Mähn - 1 year ago
Python Question

import strings into scrapy to use as crawl urls

So my question is: how do I tell Scrapy to crawl URLs that differ only by one string? For example, I have the strings saved in a txt file:

with open("plz_nummer.txt") as f:
cityZIP ='\n')

for a in xrange(0,len(cityZIP)):

next_url = '' + cityZIP[a] + '&txtBranche=&txtKunden='
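
Note that split('\n') leaves a trailing empty string when the file ends with a newline, which would turn into a malformed URL. A minimal sketch of a more defensive way to build the URL list (the base URL is left empty here because it is not shown in the snippet above):

with open("plz_nummer.txt") as f:
    # strip whitespace and drop blank lines
    city_zips = [line.strip() for line in f if line.strip()]

# base URL omitted, as in the original snippet
urls = ['' + zip_code + '&txtBranche=&txtKunden=' for zip_code in city_zips]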

sal
Answer Source

I would make loading the file of ZIP codes part of the start_requests method, as a generator. Something along the lines of:

import scrapy

class ZipSpider(scrapy.Spider):
    name = "zipCodes"
    city_zip_list = []  # class-level default; self is not available at class scope

    def start_requests(self):
        with open("plz_nummer.txt") as f:
            self.city_zip_list ='\n')
        for city_zip in self.city_zip_list:
            url = '{}&txtBranche=&txtKunden='.format(city_zip)
            yield scrapy.Request(url=url, callback=self.parse)  

    def parse(self, response):
        # Anything else you need
        # to do in here
        pass
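
The parse method is left empty above. As a minimal sketch of what it might contain, assuming you only want to record which page each ZIP code returned (the fields and selector are illustrative, not from the original question):

    def parse(self, response):
        # Illustrative only: yield the requested URL and the page title.
        yield {
            'url': response.url,
            'title': response.xpath('//title/text()').extract_first(),
        }

Inside a Scrapy project the spider can then be run with scrapy crawl zipCodes (the name defined above); outside a project, scrapy runspider <spider_file>.py runs the file directly.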

This should give you a good starting point. Also read this article:
