Mark Mark - 3 months ago 11
JSON Question

How to read in a JSON file as separate strings inside a list rather than as one big string

I am trying to read in a JSON file that looks like the sample below. The values are the timestamps of tweets. When I read in the file with my code, it comes in as one big string. Is there a way to have them separated? When I use str.split(), it splits everything. Is there a way that I can load it in or take it out to make this easier?

"Sat Aug 06 23:54:24 +0000 2016""Sat Aug 06 23:54:24 +0000 2016""Sat Aug 06 23:54:24 +0000 2016""Sat Aug 06 23:54:24 +0000 2016"


Here's how I am reading it in:

q = 'Trump'

twitter_stream = twitter.TwitterStream(auth=twitter_api.auth)

stream = twitter_stream.statuses.filter(track=q)

for tweet in stream:
    print(type(tweet))
    tweet = tweet['created_at']
    with open('dates.json', 'a') as outfile:
        json.dump(tweet, outfile, indent=4)


And here is how I am currently attempting to get it out:

with open('dates.json', 'rb') as f:
    data = f.readlines()


I want them to be separated by date so I can convert them to make a time series graph.

EDIT/UPDATE: Now I have this, but the stream just continuously collects tweets without stopping. How do I get it to stop collecting tweets and dump the JSON data into the file, whether manually or automatically?

q = 'Trump'

twitter_stream = twitter.TwitterStream(auth=twitter_api.auth)

stream = twitter_stream.statuses.filter(track=q)



dates = [tweet['created_at'] for tweet in stream]
with open('dates.json', 'a') as outfile:
    json.dump(dates, outfile, indent=4)

Answer

Collect tweet dates into a list and then dump once:

dates = [tweet['created_at'] for tweet in stream]
with open('dates.json', 'a') as outfile:
    json.dump(dates, outfile, indent=4)
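
Once the dates are written as a single JSON array, reading them back should give you a list of separate strings rather than one long string. Here is a minimal sketch of that round trip, including parsing each string into a datetime for a time series. The helper name load_dates, the format constant, and the temporary file are my own illustration, not part of your code, and the format string assumes every timestamp looks like the sample in the question. Note that json.load expects exactly one JSON document in the file, so this sketch writes with 'w'; opening with 'a' on repeated runs would leave several arrays back to back, which json.load cannot parse.

```python
import json
import os
import tempfile
from datetime import datetime

# Layout of Twitter's created_at field, e.g. "Sat Aug 06 23:54:24 +0000 2016"
# (assumed from the sample in the question).
TWITTER_TIME_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def load_dates(path):
    """Load a JSON array of created_at strings and parse each into a datetime."""
    with open(path) as f:
        raw = json.load(f)  # a list of separate strings
    return [datetime.strptime(s, TWITTER_TIME_FORMAT) for s in raw]


# Demo data written to a temporary file that stands in for dates.json.
sample = ["Sat Aug 06 23:54:24 +0000 2016", "Sat Aug 06 23:54:25 +0000 2016"]
path = os.path.join(tempfile.mkdtemp(), "dates.json")
with open(path, "w") as outfile:
    json.dump(sample, outfile, indent=4)

parsed = load_dates(path)
print(parsed[0].isoformat())  # 2016-08-06T23:54:24+00:00
```

The parsed list holds timezone-aware datetime objects, which most plotting libraries accept directly on a time axis.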

With this, how do I get it to stop streaming and dump into the file? Before, since it was dumping tweet by tweet, I would just restart the shell.

I think you should expand the comprehension to a regular loop and put it into a try/finally:

dates = []
try:
    for tweet in stream:
        dates.append(tweet['created_at'])
finally:
    with open('dates.json', 'a') as outfile:
        json.dump(dates, outfile, indent=4)
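
With the try/finally, pressing Ctrl-C stops the loop manually and the file is still written. If you would rather have it stop automatically, one option is to cap the number of tweets with itertools.islice. The following is only a sketch: collect_dates, fake_stream, and the limit of 5 are hypothetical names and values of my own, and fake_stream merely stands in for the real Twitter stream so the example runs without credentials. It writes with 'w' so the file always holds one JSON array.

```python
import json
import os
import tempfile
from itertools import islice


def collect_dates(stream, limit, path):
    """Collect created_at from at most `limit` items, then write one JSON array."""
    dates = []
    try:
        # islice stops pulling from the (otherwise endless) stream after `limit` items.
        for tweet in islice(stream, limit):
            dates.append(tweet['created_at'])
    except KeyboardInterrupt:
        pass  # Ctrl-C stops early; whatever was collected is saved below
    finally:
        with open(path, 'w') as outfile:
            json.dump(dates, outfile, indent=4)
    return dates


def fake_stream():
    """Hypothetical stand-in for the endless Twitter stream."""
    n = 0
    while True:
        yield {'created_at': 'Sat Aug 06 23:54:%02d +0000 2016' % (n % 60)}
        n += 1


# Temporary path standing in for dates.json.
demo_path = os.path.join(tempfile.mkdtemp(), 'dates.json')
collected = collect_dates(fake_stream(), 5, demo_path)
print(len(collected))  # 5
```

With the real stream you would pass the object returned by twitter_stream.statuses.filter(track=q) in place of fake_stream().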