daneshjai - 17 days ago
Python Question

KeyError when Key exists

I'm using Python and the Twitter API to get tweet objects.

I have a file (tweetfile, a .txt file on my computer) containing tweets, and I'm trying to loop through the objects to get each tweet's text. I checked the keys of the tweet object with tweetObj.keys() and 'text' is there; however, when I try to read the text with tweetObj['text'], I get KeyError: 'text'.

Code:

for line in tweetfile:
    tweetObj = json.loads(line)
    keys = tweetObj.keys()
    print keys
    tweet = tweetObj['text']
    print tweet


Below is the output:

[u'contributors', u'truncated', u'text', u'in_reply_to_status_id', u'id', u'favorite_count', u'source', u'retweeted', u'coordinates', u'entities', u'in_reply_to_screen_name', u'id_str', u'retweet_count', u'in_reply_to_user_id', u'favorited', u'user', u'geo', u'in_reply_to_user_id_str', u'possibly_sensitive', u'lang', u'created_at', u'filter_level', u'in_reply_to_status_id_str', u'place']
@awe5sauce my dad was like "so u wanna be in a relationship with a 'big dumb idiot'" nd i was like yah shes the bae u feel lmao
[u'delete']
Traceback (most recent call last):
  File "C:\apps\droid\a1\tweets.py", line 34, in <module>
    main()
  File "C:\apps\droid\a1\tweets.py", line 28, in main
    tweet = tweetObj['text']
KeyError: 'text'


I'm not sure how to approach this, since one tweet does print before the error. Why does this happen when the key exists and returns a value for some lines but not for all of them, and how can I access the value on every line that has that key?

ssm
Answer

The loop creates a separate dictionary for each line, and your output shows two of them. The first has a 'text' key. The second has only a 'delete' key and no 'text' key, hence the KeyError.
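
To see the failure in isolation, here is a minimal sketch with two made-up lines standing in for the file: one shaped like a tweet and one that has only a 'delete' key. The sample data and the name sample_lines are assumptions for illustration, not taken from the asker's file.

```python
import json

# Hypothetical stand-ins for two lines of the tweet file.
sample_lines = [
    '{"id": 1, "text": "hello world", "lang": "en"}',
    '{"delete": {"status": {"id": 2, "user_id": 3}}}',
]

results = []
for line in sample_lines:
    tweet_obj = json.loads(line)
    try:
        results.append(tweet_obj['text'])
    except KeyError as err:
        # The second dict has only a 'delete' key, so the lookup fails here.
        results.append('KeyError: %s' % err)

print(results)
```

The first lookup succeeds and the second raises, which matches the pattern in the traceback: one tweet prints, then the error appears on the next line of the file.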

Change it to:

for line in tweetfile:
    tweetObj = json.loads(line)
    keys =  tweetObj.keys()
    print keys
    if 'text' in tweetObj:
        print tweetObj['text']
    else:
        print 'This does not have a text entry'      

If you are only interested in the lines that contain text, you may want to use

[ json.loads(l)['text'] for l in tweetfile if 'text' in json.loads(l) ]

or

'\n'.join([ json.loads(l)['text'] for l in tweetfile if 'text' in json.loads(l) ])

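or, more concisely (note that this does not skip lines without a 'text' key; it puts None in the list for each of them)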

[ json.loads(l).get('text') for l in tweetfile]
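
The filtering comprehensions above call json.loads twice per line, and the .get version keeps a None for every line without text. A minimal sketch that parses each line once and skips entries without a 'text' key might look like the following. The helper name iter_tweet_texts and the sample lines are hypothetical, chosen only for illustration.

```python
import json

def iter_tweet_texts(lines):
    """Yield the 'text' value of each JSON line that has one."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines, which json.loads would reject
        tweet_obj = json.loads(line)
        if 'text' in tweet_obj:
            yield tweet_obj['text']

# Made-up sample input; in practice, pass the open file object instead.
sample_lines = [
    '{"id": 1, "text": "first tweet"}\n',
    '{"delete": {"status": {"id": 2}}}\n',
    '\n',
    '{"id": 3, "text": "second tweet"}\n',
]

texts = list(iter_tweet_texts(sample_lines))
print('\n'.join(texts))
```

Because the helper is a generator, it can be consumed lazily for a large file, or wrapped in list(...) as shown when all the texts are needed at once.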
