Marco Dinatsoli - 3 years ago
Python Question

Scrapy item loader returns a list, not a single value

I am using Scrapy 0.20 and want to use an item loader. This is my code:

l = XPathItemLoader(item=MyItemClass(), response=response)
l.add_value('url', response.url)
l.add_xpath('title',"my xpath")
l.add_xpath('developer', "my xpath")
return l.load_item()


I got the result in the JSON file, but url is a list, title is a list, and developer is a list.

How can I extract a single value instead of a list?

Should I write an item pipeline for that? I hope there is a faster way.

Answer Source

You need to set an Input or Output processor. TakeFirst would work perfectly in your case.
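
To make the effect concrete, here is a minimal pure-Python sketch of what TakeFirst does to the values a loader has collected, assuming it returns the first value that is neither None nor an empty string. The function name take_first and the sample data are hypothetical stand-ins for illustration, not Scrapy's own class.

```python
def take_first(values):
    """Stand-in for TakeFirst: return the first non-None, non-empty value."""
    for value in values:
        if value is not None and value != '':
            return value
    # Nothing usable was collected for this field.
    return None


# An item loader gathers every match for a field into a list,
# so even a single XPath hit arrives wrapped in a list.
collected_title = ['', 'My App', 'My App (second match)']
title = take_first(collected_title)
```

With this processor applied on output, the field holds a plain string rather than a one-element list.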

There are multiple places where you can define it, e.g. in the Item definition:

from scrapy.item import Item, Field
# Import path for Scrapy 1.0+; on 0.20 the module is scrapy.contrib.loader.processor
from scrapy.loader.processors import TakeFirst

class MyItem(Item):
    url = Field(output_processor=TakeFirst())
    title = Field(output_processor=TakeFirst())
    developer = Field(output_processor=TakeFirst())

Or set default_output_processor on an XPathItemLoader instance:

l.default_output_processor = TakeFirst()
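
How the two options interact can be sketched without Scrapy: a loader collects values per field into a list, then passes each list through that field's output processor, falling back to the loader-wide default when the field has none. The ToyLoader class below is a hypothetical illustration of that fallback, not Scrapy's implementation; identity and take_first are simplified stand-ins for the Identity and TakeFirst processors.

```python
def identity(values):
    """Return the collected list unchanged (why lists appear by default)."""
    return values


def take_first(values):
    """Return the first non-None, non-empty value, or None."""
    for value in values:
        if value is not None and value != '':
            return value
    return None


class ToyLoader:
    """Hypothetical, simplified loader: collects lists, applies output processors."""

    def __init__(self, field_processors=None):
        self.default_output_processor = identity
        self.field_processors = field_processors or {}
        self._values = {}

    def add_value(self, field, value):
        # Every added value is accumulated in a per-field list.
        values = value if isinstance(value, list) else [value]
        self._values.setdefault(field, []).extend(values)

    def load_item(self):
        item = {}
        for field, values in self._values.items():
            # A field-specific processor wins over the loader default.
            processor = self.field_processors.get(
                field, self.default_output_processor)
            item[field] = processor(values)
        return item


# Default behaviour: the single URL comes back wrapped in a list.
plain = ToyLoader()
plain.add_value('url', 'http://example.com')
plain_item = plain.load_item()

# Changing the default output processor unwraps every field at once.
unwrapped = ToyLoader()
unwrapped.default_output_processor = take_first
unwrapped.add_value('url', 'http://example.com')
unwrapped.add_value('title', ['My App', 'My App again'])
unwrapped_item = unwrapped.load_item()
```

Setting the default is convenient when every field should hold one value; per-field processors are the better fit when some fields must stay lists.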