Barry Barry - 1 year ago 56
Python Question

How do I split the text of a tweet so that each word goes in its own column of a CSV file?

I am outputting tweets to a CSV file and want to split the text portion so that each word is in its own column, so I can run it through a classifier using Python.

for tweet in alltweets:

    # Loop to only return the tweets that have been posted in the last 24 hours
    if ( - tweet.created_at).days < 1:
        # transform the tweepy tweets into a 2D array that will populate the csv
        outtweets.append([, tweet.created_at, tweet.text.encode("utf-8")])

deadend = True
if not deadend:
page += 1

# write the csv
with open('%s_tweets.csv' % screen_name, 'w') as f:
    writer = csv.writer(f)
    writer.writerow(["name", "created_at", "text"])

** EDIT **
Tweets in CSV

** EDIT 2 **

outtweets.append(list(itertools.chain([, tweet.created_at],tweet.text.encode("utf-8").split(' '))))
TypeError: a bytes-like object is required, not 'str'

Answer Source

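Since tweet.text is one string, you can split it on spaces into individual words before writing it out. In Python 3, split the text before encoding it: tweet.text.encode("utf-8") returns bytes, and calling .split(' ') on bytes raises the TypeError shown in EDIT 2.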

import itertools

tweets = [['user1', 'text of tweet 1'], ['user2', 'text of tweet2']]

for tweet in tweets:
    # chain the user name with the individual words of the tweet text
    print(list(itertools.chain([tweet[0]], tweet[1].split(' '))))

['user1', 'text', 'of', 'tweet', '1']
['user2', 'text', 'of', 'tweet2']
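The TypeError from EDIT 2 can be reproduced in isolation. The following is a minimal sketch, assuming Python 3 and using a made-up sample string in place of tweet.text; it shows that splitting the str works, while splitting the encoded bytes needs a bytes separator.

```python
# Hypothetical sample standing in for tweet.text
text = "text of tweet 1"

# In Python 3, encode() returns bytes, not str
encoded = text.encode("utf-8")

error_message = None
try:
    # A str separator on a bytes object is rejected
    encoded.split(' ')
except TypeError as exc:
    error_message = str(exc)

# Either split the str directly...
words_from_str = text.split(' ')
# ...or, if bytes are really needed, use a bytes separator
words_from_bytes = encoded.split(b' ')

print(error_message)
print(words_from_str)
print(words_from_bytes)
```

For writing to a CSV file opened in text mode, splitting the str and skipping the encode step is usually the simpler choice.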

Try this in your code, in place of the current outtweets.append (the text is split as a str, without .encode, which avoids the TypeError from EDIT 2):

outtweets.append(list(itertools.chain([, tweet.created_at], tweet.text.split(' '))))

The above code builds two lists, one with the existing attributes and one with the words of the tweet text, and itertools.chain then merges them into a single list that becomes one CSV row.
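Because tweets have different word counts, the resulting rows have different lengths. Here is a rough sketch of one way to write them out, assuming Python 3; the sample tuples, the helper names build_rows and write_word_columns, and the word_1, word_2, ... header names are all made up for illustration and are not part of the original code. It pads short rows with empty cells so every row matches the header.

```python
import csv
import io
import itertools

# Hypothetical sample data standing in for the tweepy results:
# (name, created_at, text)
sample_tweets = [
    ("user1", "2017-01-01 10:00:00", "text of tweet 1"),
    ("user2", "2017-01-01 11:00:00", "text of tweet2"),
]


def build_rows(tweets):
    """Return one list per tweet: the fixed fields followed by one word per cell."""
    return [
        list(itertools.chain([name, created_at], text.split()))
        for name, created_at, text in tweets
    ]


def write_word_columns(tweets, fileobj):
    """Write the tweets as CSV with one column per word, padding short rows."""
    rows = build_rows(tweets)
    # The first two cells are name and created_at; the rest are words
    max_words = max((len(row) - 2 for row in rows), default=0)
    header = ["name", "created_at"] + [
        "word_%d" % i for i in range(1, max_words + 1)
    ]
    writer = csv.writer(fileobj)
    writer.writerow(header)
    for row in rows:
        # Pad with empty strings so every row has as many cells as the header
        writer.writerow(row + [""] * (len(header) - len(row)))


# An in-memory buffer keeps the sketch self-contained
buffer = io.StringIO()
write_word_columns(sample_tweets, buffer)
csv_lines = buffer.getvalue().splitlines()
print("\n".join(csv_lines))
```

To write a real file instead of the buffer, pass a file opened with open(path, 'w', newline=''), which is what the csv module documentation recommends. Note that text.split() with no argument splits on any run of whitespace, so double spaces and newlines inside a tweet do not produce empty cells the way split(' ') would.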