Thomas Kingaroy - 3 years ago
JSON Question

Scraping a JSON response with Scrapy

How do you use Scrapy to scrape web requests that return JSON? For example, the JSON would look like this:

{
    "firstName": "John",
    "lastName": "Smith",
    "age": 25,
    "address": {
        "streetAddress": "21 2nd Street",
        "city": "New York",
        "state": "NY",
        "postalCode": "10021"
    },
    "phoneNumber": [
        {
            "type": "home",
            "number": "212 555-1234"
        },
        {
            "type": "fax",
            "number": "646 555-4567"
        }
    ]
}


I would be looking to scrape specific items (e.g. name and fax in the above) and save to CSV.

Answer

It's the same as using Scrapy's HtmlXPathSelector for HTML responses. The only difference is that you use the json module to parse the response:

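import json

import scrapy


class MySpider(scrapy.Spider):  # BaseSpider was renamed to Spider in later Scrapy releases
    ...

    def parse(self, response):
        # response.text is the decoded body; it replaces the deprecated body_as_unicode()
        jsonresponse = json.loads(response.text)

        # MyItem is assumed to be a scrapy.Item subclass with a firstName field, defined elsewhere
        item = MyItem()
        item["firstName"] = jsonresponse["firstName"]

        return item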
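
To get at the name and fax fields from the question and write them out as CSV, something along these lines might work. It's a minimal sketch that uses only the standard library so it can be run without Scrapy; extract_row and rows_to_csv are made-up helper names (not part of Scrapy), and the sample string is a trimmed copy of the JSON in the question.

```python
import csv
import io
import json


def extract_row(payload):
    """Pick the first name and the fax number out of a JSON string."""
    data = json.loads(payload)
    # Take the first phone entry whose type is "fax"; None if there is none.
    fax = next(
        (p["number"] for p in data.get("phoneNumber", []) if p.get("type") == "fax"),
        None,
    )
    return {"firstName": data["firstName"], "fax": fax}


def rows_to_csv(rows):
    """Serialize a list of row dicts to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["firstName", "fax"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Trimmed copy of the JSON from the question, used as sample input.
sample = (
    '{"firstName": "John", "lastName": "Smith", "phoneNumber": ['
    '{"type": "home", "number": "212 555-1234"}, '
    '{"type": "fax", "number": "646 555-4567"}]}'
)
row = extract_row(sample)
print(rows_to_csv([row]))
```

Inside a spider you would probably call something like extract_row(response.text) from parse and return the result. If you run the spider with Scrapy's feed export (for example scrapy crawl myspider -o items.csv, where myspider is a placeholder name), Scrapy can write the CSV for you, so rows_to_csv is only needed outside of that.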

Hope that helps.
