Sagar Raj Singh - 7 months ago
Python Question

Extracting only tweets with a particular #hashtag

I am using Python 3.4 and the tweepy API to extract tweets to a text file, but instead of only the tweets, the entire source of the page is being extracted. Is there a way to get only the tweets and not the entire source code?

Answer

The tweets arrive formatted as JSON, so bring a JSON module (simplejson) into your script to decode them. (By the way, this was written for Python 2.7, so on Python 3 the encoding and print steps will need adjusting.)

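# helper from older tweepy releases; if your version lacks it, the standard library's `import json` does the same job
from tweepy.utils import import_simplejson
json = import_simplejson()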

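Load each tweet with Python and make sure that the data_temp['entities']['hashtags'] field is not empty (so the tweet has at least one hashtag):

# `data` is the raw JSON string for a single tweet
data_temp = json.loads(data)
# an empty hashtag list is falsy, so tweets without hashtags are skipped
if data_temp['entities']['hashtags']:
    # Python 2.7: encode to UTF-8 bytes before writing; on Python 3 drop .encode() and keep the str
    tweet_text = data_temp["text"].encode('utf-8')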
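
The check above keeps any tweet that has a hashtag, while the question asks for one particular hashtag. Here is a minimal sketch of how the test could be narrowed. It assumes the classic tweet JSON layout, where each entry in entities.hashtags has a "text" field without the leading "#". The function name has_hashtag is hypothetical and not part of tweepy.

```python
import json


def has_hashtag(raw_tweet, wanted):
    """Return True if the raw tweet JSON carries the hashtag `wanted`."""
    tweet = json.loads(raw_tweet)
    # Stream messages that are not tweets (e.g. delete notices) have no "entities" key
    hashtags = tweet.get("entities", {}).get("hashtags", [])
    # Compare case-insensitively and accept the tag with or without "#"
    wanted = wanted.lstrip("#").lower()
    return any(h.get("text", "").lower() == wanted for h in hashtags)
```

Calling has_hashtag(data, "#python") in place of the plain emptiness check would then keep only tweets tagged with that hashtag.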

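Then print it out to a file:

# write_file is a file object already opened for writing; Python 2.7 needs `from __future__ import print_function` for this call
print(tweet_text, file=write_file)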
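
Since the question mentions Python 3.4, here is a rough Python 3 sketch of the same load, check, and write steps in one place. It assumes the raw tweets are available as an iterable of JSON strings. The name write_tweet_texts is made up for illustration.

```python
import json


def write_tweet_texts(raw_tweets, out_file):
    """Write the text of every tweet that has at least one hashtag, one per line."""
    written = 0
    for raw in raw_tweets:
        tweet = json.loads(raw)
        # Skip messages without entities or with an empty hashtag list
        if tweet.get("entities", {}).get("hashtags"):
            # Collapse whitespace runs (including newlines) so each tweet stays on one line
            text = " ".join(tweet.get("text", "").split())
            # On Python 3 the text is already str, so no .encode() is needed here
            print(text, file=out_file)
            written += 1
    return written
```

Opening the output with open("tweets.txt", "w", encoding="utf-8") and passing that file object as out_file lets Python handle the UTF-8 encoding on write.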

These are only snippets; use this guy's script to help you get going and source the tweets. Big ups to him: https://github.com/bwbaugh/twitter-corpus