user3768495 - 1 year ago
Python Question

How to debug a scrapy pipeline?

I am following this tutorial to learn how to use Scrapy and MongoDB together. However, I keep getting this error when I run the crawl:

[Anaconda2] C:\Users\Segovia\Dropbox\stack>scrapy crawl stack
Traceback (most recent call last):
File "c:\users\segovia\anaconda2\lib\", line 162, in _run_module_as_main
"__main__", fname, loader, pkg_name)
File "c:\users\segovia\anaconda2\lib\", line 72, in _run_code
exec code in run_globals
File "C:\Users\Segovia\Anaconda2\Scripts\scrapy.exe\", line 9, in <module>
File "c:\users\segovia\anaconda2\lib\site-packages\scrapy\", line 108, in execute
settings = get_project_settings()
File "c:\users\segovia\anaconda2\lib\site-packages\scrapy\utils\", line 60, in get_project_settings
settings.setmodule(settings_module_path, priority='project')
File "c:\users\segovia\anaconda2\lib\site-packages\scrapy\settings\", line 285, in setmodule
self.set(key, getattr(module, key), priority)
File "c:\users\segovia\anaconda2\lib\site-packages\scrapy\settings\", line 260, in set
self.attributes[name].set(value, priority)
File "c:\users\segovia\anaconda2\lib\site-packages\scrapy\settings\", line 55, in set
value = BaseSettings(value, priority=priority)
File "c:\users\segovia\anaconda2\lib\site-packages\scrapy\settings\", line 91, in __init__
self.update(values, priority)
File "c:\users\segovia\anaconda2\lib\site-packages\scrapy\settings\", line 317, in update
for name, value in six.iteritems(values):
File "c:\users\segovia\anaconda2\lib\site-packages\", line 599, in iteritems
return d.iteritems(**kw)
AttributeError: 'list' object has no attribute 'iteritems'

Can someone tell me what went wrong, or give me a hint on how to debug it? I've tried the 'parse' command described in the official Scrapy documentation, but it did not work for me. Ideally I would like to use an IDE to step into the code and see in detail what is going on. Thanks!

The project settings file has these lines in it:

ITEM_PIPELINES = ['stack.pipelines.MongoDBPipeline', ]

MONGODB_SERVER = "localhost"
MONGODB_DB = "stackoverflow"

And I am sure 'mongod' is running in another cmd window.

Answer Source

Let's look at the error:

AttributeError: 'list' object has no attribute 'iteritems'

At this part of your project settings:

ITEM_PIPELINES = ['stack.pipelines.MongoDBPipeline', ]

And at this documentation page.

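Scrapy expects ITEM_PIPELINES to be a dictionary, and you are giving it a list. The keys are the pipeline class paths and the values are integers that set the order in which the pipelines run (lower values run first; by convention they fall in the 0-1000 range). Fix it: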

ITEM_PIPELINES = {'stack.pipelines.MongoDBPipeline': 300}
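
If you want to catch this kind of mistake before running the crawl, a small check along these lines may help. This is only a sketch: check_item_pipelines is a hypothetical helper written for this answer, not part of Scrapy, and it only tests the two things discussed above (the setting is a dict, and each order value is an integer).

```python
def check_item_pipelines(value):
    """Return a list of problems with an ITEM_PIPELINES value (empty if it looks fine).

    Hypothetical helper for illustration; it is not a Scrapy API.
    """
    if not isinstance(value, dict):
        return ["expected a dict, got %s" % type(value).__name__]
    problems = []
    for path, order in value.items():
        if not isinstance(order, int):
            problems.append("order for %r should be an integer, got %r" % (path, order))
    return problems


old_setting = ['stack.pipelines.MongoDBPipeline', ]      # the list from the question
new_setting = {'stack.pipelines.MongoDBPipeline': 300}   # the dict from the fix

print(check_item_pipelines(old_setting))  # one problem reported: not a dict
print(check_item_pipelines(new_setting))  # no problems: empty list
```

The list fails for the same reason as in the traceback: the settings code iterates over key/value pairs, and a list has none to offer.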