ayush singhal - 1 month ago
Python Question

How to save JSON content from a web query

I have the following code for querying a web API and saving the JSON response. The query returns JSON, but when I save it with my code the file no longer contains valid JSON objects; everything is escaped into a single string.

file_out='C:\\Users\\ayush488\\Desktop\\annotation_for_new_dataset\\url'+str(cnt)+'.txt'
cnt=cnt+1

response=urllib2.urlopen('http://www.diffbot.com/api/article?token='+token+'&url='+url).read()
with open(file_out,'w') as outfile:
    json.dump(response,outfile)


Can anyone tell me how to save the JSON content properly?

Here is a sample of the output:

"{\"icon\":\"http:\\/\\/open.blogs.nytimes.com\\/favicon.ico\",\"author\":

Answer

You are double-encoding the JSON response.
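To make the double encoding concrete, here is a minimal sketch. The response text is a made-up stand-in for what the web query returns, not actual Diffbot output, and it uses json.dumps() (the string-returning counterpart of json.dump()) so no file is needed:

```python
import json

# Hypothetical stand-in for the raw JSON text returned by the web query.
response = '{"icon": "http://example.com/favicon.ico", "author": "someone"}'

# Encoding a str as JSON yields a JSON *string literal*:
# the whole document is wrapped in quotes and inner quotes are escaped.
double_encoded = json.dumps(response)

# Decoding that once only gives back the original text, not a dict.
once = json.loads(double_encoded)

# Decoding the original text gives the Python objects.
data = json.loads(response)
```

Here double_encoded begins with "{\"icon\", the same shape as the sample output in the question.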

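You are receiving a string value that is already JSON, one that json.loads() could turn into Python objects. Passing that string to json.dump() encodes it a second time, as a single JSON string, which is where the surrounding quotes and backslashes come from. To save the response to a file, do not encode or decode anything; write the data straight to the file object. The most efficient way is to use shutil.copyfileobj(), which copies from one file-like object to another: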

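import shutil
import urllib2

# Keep the file-like response object; do not call .read() on it,
# because copyfileobj() needs an object to read from, not a string.
response = urllib2.urlopen('http://www.diffbot.com/api/article?token=' + token + '&url=' + url)
# Open the output in binary mode so the bytes are copied unchanged.
with open(file_out, 'wb') as outfile:
    shutil.copyfileobj(response, outfile)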
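If you would rather validate or pretty-print the data before saving, an alternative is to decode once and encode once. The following is only a sketch under some assumptions: save_pretty is a hypothetical helper name, io.BytesIO stands in for the object returned by urlopen(), the payload and temporary path are made up, and it relies on json.load() accepting bytes input (Python 3.6 or later):

```python
import io
import json
import os
import tempfile


def save_pretty(response, path):
    """Parse a file-like JSON response and write it back indented."""
    data = json.load(response)  # decode once into Python objects
    with open(path, 'w') as outfile:
        # encode once; indent and sort_keys only affect layout
        json.dump(data, outfile, indent=2, sort_keys=True)
    return data


# Stand-in for the response object; not real API output.
fake_response = io.BytesIO(
    b'{"icon": "http://example.com/favicon.ico", "author": "someone"}'
)

# Hypothetical output location in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), 'url0.txt')
saved = save_pretty(fake_response, path)
```

This costs a parse and a re-serialization, so copying the raw response remains the cheaper option when the JSON only needs to be stored as received.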