user61629 - 1 year ago
Python Question

TypeError: takes exactly 1 argument (0 given) - Scrapy

I'm working with scrapy. I want to generate a unique user agent for each request. I have the following:

class ContactSpider(Spider):
    name = "contact"

    def getAgent(self):
        f = open('useragentstrings.txt')
        agents = f.readlines()
        return random.choice(agents).strip()

    headers = {
        'user-agent': getAgent(),
        'content-type': "application/x-www-form-urlencoded",
        'cache-control': "no-cache"
    }

    def parse(self, response):
        open_in_browser(response)


getAgent generates an agent from a list of the form:

"Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36"


However, when I run this I get:

  File "..spiders\contact_spider.py", line 35, in <module>
    class ContactSpider(Spider):
  File "..spiders\contact_spider.py", line 54, in ContactSpider
    'user-agent': getAgent(),
TypeError: getAgent() takes exactly 1 argument (0 given)

Answer

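getAgent() is defined as an instance method, so it expects the ContactSpider instance as its first argument (self). Here it is called with no arguments while the class body is still being executed, before any instance exists, which is why you get the TypeError. The function doesn't need to be a member of your spider class at all: move it to a separate "helpers"/"utils"/"libs" module, drop the self parameter, and import it: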

from helpers import getAgent

class ContactSpider(Spider):
    name = "contact"

    headers = {
        'user-agent': getAgent(),
        'content-type': "application/x-www-form-urlencoded",
        'cache-control': "no-cache"
    }

    def parse(self, response):
        open_in_browser(response)
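
For completeness, here is a minimal sketch of what the helpers module could contain. The module name helpers comes from the import above; the load_agents function and the path parameter are assumptions added for illustration, and the file is assumed to hold one user agent string per line.

```python
# helpers.py (module name taken from the import above)
import random


def load_agents(path='useragentstrings.txt'):
    """Return the non-empty lines of the user agent file, stripped.

    load_agents and the path parameter are hypothetical additions.
    """
    # The with block closes the file, unlike a bare open() call.
    with open(path) as f:
        return [line.strip() for line in f if line.strip()]


def getAgent(path='useragentstrings.txt'):
    """Pick one user agent string at random from the file."""
    return random.choice(load_agents(path))
```

Because getAgent is now a plain module-level function with no self parameter, calling getAgent() inside the class body works.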

See also: Difference between Class and Instance methods.
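
One caveat: a class-level headers dict is built only once, when the class is defined, so every request would carry the same user agent. If the goal is a different agent per request, the headers have to be built per request instead. A rough sketch, assuming a hypothetical build_headers helper and an in-memory AGENTS list (neither is from the question):

```python
import random

# Hypothetical in-memory list; in practice it would be read once from
# useragentstrings.txt.
AGENTS = [
    "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) "
    "Chrome/52.0.2743.116 Safari/537.36",
    "ExampleAgent/1.0",  # placeholder entry for illustration only
]


def build_headers(agents=AGENTS):
    """Return a new headers dict with a randomly chosen user agent."""
    return {
        'user-agent': random.choice(agents),
        'content-type': "application/x-www-form-urlencoded",
        'cache-control': "no-cache",
    }


# In a spider this would be called once per request, for example:
#
#     def start_requests(self):
#         for url in self.start_urls:
#             yield scrapy.Request(url, headers=build_headers(),
#                                  callback=self.parse)
```

Each call returns a fresh dict, so the random choice is made again for every request.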


Or, as an alternative approach, there is a Scrapy middleware, scrapy-fake-useragent, that rotates user agents randomly on every request without any changes to the spider. The user agent strings are supplied by the fake-useragent module.
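
Enabling the middleware is a matter of project settings. The fragment below is a sketch based on the package's README as I remember it; the module paths are assumptions that may differ between Scrapy and package versions, so check the documentation of the version you install.

```python
# settings.py
DOWNLOADER_MIDDLEWARES = {
    # Disable Scrapy's built-in user agent middleware.
    'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware': None,
    # Enable the random user agent middleware (path assumed from the README).
    'scrapy_fake_useragent.middleware.RandomUserAgentMiddleware': 400,
}
```

With this in place, no user-agent entry is needed in the spider's headers.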