Barry Barry - 1 year ago 63
Python Question

How do I split the text of a tweet so that each word goes in its own column of a CSV file?

I am writing tweets to a CSV file with Python and want to split the text portion so that each word lands in its own column, so I can run the file through a classifier.

for tweet in alltweets:

    # Loop to only return the tweets that have been posted in the last 24 hours
    if (datetime.datetime.now() - tweet.created_at).days < 1:
        # transform the tweepy tweets into a 2D array that will populate the csv
        outtweets.append([tweet.user.name, tweet.created_at, tweet.text.encode("utf-8")])

    else:
        deadend = True
        return
if not deadend:
    page += 1

# write the csv
with open('%s_tweets.csv' % screen_name, 'w') as f:
    writer = csv.writer(f)
    writer.writerow(["name", "created_at", "text"])
    writer.writerows(outtweets)
    pass


** EDIT **
Tweets in CSV

** EDIT 2 **

outtweets.append(list(itertools.chain([tweet.user.name, tweet.created_at],tweet.text.encode("utf-8").split(' '))))
TypeError: a bytes-like object is required, not 'str'

Answer

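Since tweet.text is one string, you can split it on spaces into individual words before writing the row. On Python 3, split the str itself rather than the result of tweet.text.encode("utf-8"): encode returns bytes, and calling .split(' ') on bytes raises the TypeError shown in EDIT 2.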

tweets = [['user1','text of tweet 1'],['user2','text of tweet2']]

import itertools
for tweet in tweets:
    print(list(itertools.chain([tweet[0]], tweet[1].split(' '))))

['user1', 'text', 'of', 'tweet', '1']
['user2', 'text', 'of', 'tweet2']
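
To show why the attempt in EDIT 2 fails on Python 3, here is a minimal sketch. The variable text is only a stand-in for tweet.text, which is assumed to be a str:

```python
text = "text of tweet 1"  # stand-in for tweet.text (assumed to be a str)

# Splitting the str with a str separator works as expected.
words = text.split(' ')

# Splitting the encoded bytes with a str separator raises TypeError on Python 3.
try:
    text.encode("utf-8").split(' ')
    error = None
except TypeError as exc:
    error = str(exc)

print(words)
print(error)
```

If you really need bytes, the separator would have to be bytes as well (b' '), but for writing a CSV on Python 3 it is simpler to keep everything as str.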

Try the following in your code, in place of the current outtweets.append call:

outtweets.append(list(itertools.chain([tweet.user.name, tweet.created_at], tweet.text.split(' '))))

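The above code builds two lists, one with the existing attributes and one with the words of the tweet text, and itertools.chain merges them into a single row. Because tweets contain different numbers of words, the rows will have different numbers of columns.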
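
Since a classifier will usually expect every row to have the same number of columns, one option is to pad the shorter rows and generate a matching header. The following is only a sketch under assumptions: the tuples stand in for (tweet.user.name, tweet.created_at, tweet.text), the helper build_rows and the word_N column names are made up for illustration, and the CSV is written to an in-memory buffer instead of a file.

```python
import csv
import io
import itertools

# Hypothetical stand-ins for (tweet.user.name, tweet.created_at, tweet.text).
tweets = [
    ("user1", "2017-01-01 10:00:00", "text of tweet 1"),
    ("user2", "2017-01-01 11:00:00", "text of tweet2"),
]


def build_rows(tweets):
    """Return (header, rows) with one column per word, padded to equal width."""
    # split() with no argument splits on any run of whitespace,
    # so repeated spaces do not produce empty columns.
    rows = [
        list(itertools.chain([name, created_at], text.split()))
        for name, created_at, text in tweets
    ]
    # The first two columns are name and created_at; the rest are words.
    max_words = max((len(row) - 2 for row in rows), default=0)
    header = ["name", "created_at"] + ["word_%d" % i for i in range(1, max_words + 1)]
    # Pad shorter rows with empty strings so every row matches the header.
    padded = [row + [""] * (len(header) - len(row)) for row in rows]
    return header, padded


header, rows = build_rows(tweets)

# Write to an in-memory buffer here; a real file would work the same way.
buffer = io.StringIO(newline="")
writer = csv.writer(buffer)
writer.writerow(header)
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```

To write to disk on Python 3 instead, the buffer could be replaced with open('%s_tweets.csv' % screen_name, 'w', newline='', encoding='utf-8'), which lets the csv module handle line endings and keeps the text as str rather than bytes.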
