user61629 - 3 months ago
Python Question

TypeError: takes exactly 1 argument (0 given) - Scrapy

I'm working with Scrapy and want to generate a unique user agent for each request. I have the following:

class ContactSpider(Spider):
    name = "contact"

    def getAgent(self):
        f = open('useragentstrings.txt')
        agents = f.readlines()
        return random.choice(agents).strip()

    headers = {
        'user-agent': getAgent(),
        'content-type': "application/x-www-form-urlencoded",
        'cache-control': "no-cache"
    }

    def parse(self, response):
        open_in_browser(response)


getAgent generates an agent from a list of the form:

"Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36"


However, when I run this I get:

  File "..spiders\contact_spider.py", line 35, in <module>
    class ContactSpider(Spider):
  File "..spiders\contact_spider.py", line 54, in ContactSpider
    'user-agent': getAgent(),
TypeError: getAgent() takes exactly 1 argument (0 given)

Answer

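getAgent() is defined as an instance method, so it expects a ContactSpider instance as its first argument (self). Inside the class body, however, no instance exists yet: getAgent is still a plain function there, and calling getAgent() passes it nothing, hence the TypeError. The function doesn't need to be a member of your spider class at all. Move it to a separate "helpers"/"utils"/"libs" module and import it: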

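from scrapy import Spider
from scrapy.utils.response import open_in_browser

from helpers import getAgent

class ContactSpider(Spider):
    name = "contact"

    # Evaluated once, when the class is defined, so this dict holds a single agent.
    headers = {
        'user-agent': getAgent(),
        'content-type': "application/x-www-form-urlencoded",
        'cache-control': "no-cache"
    }

    def parse(self, response):
        # Open the downloaded page in a browser for inspection.
        open_in_browser(response)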
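
The import above assumes a helpers module that the answer does not show. A minimal sketch of what it might contain is below; the file name helpers.py and the `path` parameter are assumptions for illustration, while the file name useragentstrings.txt is taken from the question.

```python
# helpers.py (hypothetical module name, matching the import above)
import random


def getAgent(path='useragentstrings.txt'):
    """Return one random, non-empty user agent string from the file at `path`."""
    # The with-block closes the file, which the bare open() in the question did not.
    with open(path) as f:
        agents = [line.strip() for line in f if line.strip()]
    return random.choice(agents)
```

Because getAgent is now a module-level function with no `self` parameter, calling getAgent() with no arguments works anywhere, including inside a class body.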

See also: Difference between Class and Instance methods.
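
The failure can be reproduced without Scrapy, which may make the cause easier to see. The sketch below uses made-up names (BrokenSpider, EvaluatedOnce, build_headers) and placeholder agent strings. It also illustrates a related point: an expression in a class body runs only once, so a class-level headers dict would not by itself give a different agent per request.

```python
import random

AGENTS = ['agent-a', 'agent-b', 'agent-c']  # placeholders, not real user agent strings


def define_broken_spider():
    class BrokenSpider(object):
        def get_agent(self):
            return random.choice(AGENTS)

        # The class body runs top to bottom like ordinary code. At this point
        # get_agent is a plain function and no instance exists, so nothing is
        # passed for `self` and the call raises TypeError.
        headers = {'user-agent': get_agent()}

    return BrokenSpider


try:
    define_broken_spider()
    error_message = None
except TypeError as exc:
    error_message = str(exc)


class EvaluatedOnce(object):
    # This expression runs a single time, when the class statement executes.
    headers = {'user-agent': random.choice(AGENTS)}


def build_headers():
    # This runs on every call, so each call draws a new random agent.
    return {
        'user-agent': random.choice(AGENTS),
        'content-type': 'application/x-www-form-urlencoded',
        'cache-control': 'no-cache',
    }
```

In a spider, a dict built this way could be passed through the `headers` argument of `scrapy.Request` for each request, for example from `start_requests`, if a fresh agent per request is the goal.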


Alternatively, the scrapy-fake-useragent middleware for Scrapy rotates user agents randomly and transparently, so the spider itself does not have to set the header. The user agent strings are supplied by the fake-useragent module.
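
Enabling the middleware mentioned above is a settings change rather than a spider change. The fragment below is a sketch based on the package's README as I recall it; the middleware path and the priority value 400 are assumptions that should be checked against the installed version.

```python
# settings.py (sketch; verify the paths against the installed package)
DOWNLOADER_MIDDLEWARES = {
    # Disable Scrapy's built-in user agent middleware so it does not override the random one.
    'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware': None,
    # Enable the random user agent middleware; 400 is its position in the middleware order.
    'scrapy_fake_useragent.middleware.RandomUserAgentMiddleware': 400,
}
```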