Uasthana - 5 months ago
Python Question

Scraping nested tags in JSON arrays

I am trying to scrape data from a JSON file. I can read some of the tags, but a few nested tags are giving me problems. The following is a sample from the file:

{"orders":[{
"order_id":9000,
"flight_start":"2017-06-15T05:00:00.000Z",
"flight_end":"2017-06-22T05:00:00.000Z",
"spots":[{
"spot_id":7354259,
"spot_length":15}],
"constraints":{
"forbid":[{
"network":"BRVO"},
{"network":"DSE"},
{"network":"ESPN"},
{"network":"DFC"},
{"hours":[2,6],
"days_of_week":["Monday","Tuesday","Thursday","Friday"]},
{"hours":[2,6],
"days_of_week":["Saturday","Sunday"]}],
"allocation":[{
"hours":[6,9],
"impressions":{
"min":0.05,
"max":0.05},
"days_of_week":["Monday","Tuesday","Wednesday","Thursday","Friday"]},{
"hours":[20,0],
"impressions":{"min":0.5,"max":0.5},
"days_of_week":["Monday","Tuesday","Wednesday","Thursday","Friday"]},{
"budget":{
"min":1,
"max":1},
"spot_length":15}]}}]}


I am not able to scrape all the values from the network tags; my code returns only the first network value for each order.

I am using the following code -

import json
import urllib.request

url = 'http://vw-test.elasticbeanstalk.com/test'
json_obj = urllib.request.urlopen(url).read().decode('UTF-8')
data = json.loads(json_obj)
for i in data["orders"]:
    k = i["order_id"]
    j = i["flight_start"]
    l = i["flight_end"]
    m = i["spots"]
    for value in m:
        a = value["spot_length"]
        b = value["spot_id"]
    n = i["constraints"]
    c = n["forbid"]
    d = c[0]
    e = d["network"]
    print(e)


If anyone could help me figure this out, I'd be very grateful.

Answer

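The JSON data in your question isn't complete. Your code reads only c[0], the first entry of the forbid list, so it prints only the first network. Making some assumptions, this could work: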

for i in data["orders"]:
    k = i["order_id"]
    j = i["flight_start"]
    l = i["flight_end"]
    m = i["spots"]
    for value in m:
        a = value["spot_length"]
        b = value["spot_id"]
    n = i["constraints"]
    c = n["forbid"]
    # Collect "network" from every entry of "forbid" that has one;
    # entries holding only "hours"/"days_of_week" are skipped.
    networks = [d["network"] for d in c if "network" in d]
    print(networks)
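
To try the same idea without hitting the URL, here is a minimal self-contained sketch. It assumes the structure shown in the question and uses a trimmed copy of that sample. The helper name forbidden_networks and the use of .get() with defaults are my own additions for illustration, not something from the original code:

```python
import json

# Trimmed copy of the sample from the question (only the fields used here).
SAMPLE = '''
{"orders": [{
  "order_id": 9000,
  "constraints": {
    "forbid": [
      {"network": "BRVO"},
      {"network": "DSE"},
      {"network": "ESPN"},
      {"network": "DFC"},
      {"hours": [2, 6], "days_of_week": ["Saturday", "Sunday"]}
    ]
  }
}]}
'''


def forbidden_networks(order):
    """Return every "network" value found in order["constraints"]["forbid"].

    Hypothetical helper: .get() with defaults is an assumption so that an
    order without "constraints" or "forbid" yields an empty list instead
    of raising KeyError.
    """
    forbid = order.get("constraints", {}).get("forbid", [])
    # Keep only the entries that actually carry a "network" key.
    return [entry["network"] for entry in forbid if "network" in entry]


data = json.loads(SAMPLE)
# Map each order_id to the list of networks it forbids.
networks_by_order = {
    order["order_id"]: forbidden_networks(order) for order in data["orders"]
}
print(networks_by_order)
```

Keying the result by order_id keeps the networks of different orders apart, which may be handy if the real file contains more than one order.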