Abe - 5 months ago

scrapy: Call a function when a spider quits

Is there a way to trigger a method in a Spider class just before it terminates?

I can terminate the spider myself, like this:

from scrapy.exceptions import CloseSpider
from scrapy.spiders import CrawlSpider

class MySpider(CrawlSpider):
    # Config stuff goes here...

    def quit(self):
        # Do some stuff...
        raise CloseSpider('MySpider is quitting now.')

    def my_parser(self, response):
        if termination_condition:
            self.quit()

        # Parsing stuff goes here...

But I can't find any information on how to determine when the spider is about to quit naturally.


It looks like you can register a listener for the spider_closed signal through the dispatcher.

I would try something like:

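from scrapy import signals
from scrapy.spiders import CrawlSpider
# Deprecated in newer Scrapy releases; this import path may not exist there.
from scrapy.xlib.pydispatch import dispatcher

class MySpider(CrawlSpider):
    def __init__(self, *args, **kwargs):
        # Let CrawlSpider finish its own setup before connecting the listener.
        super(MySpider, self).__init__(*args, **kwargs)
        dispatcher.connect(self.spider_closed, signals.spider_closed)

    def spider_closed(self, spider):
        # The second parameter is the instance of the spider about to be closed.
        pass  # Do your cleanup here.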