I'm trying to train a classifier on text from a chat between two users so that later on I can predict which of the two users is more likely to say X sentence/word. To get there I mined the text from the chat log and ended up with two arrays of words,
You're asking what ML representation you should use for user-classification of chat text.
Bag-of-words and word vectors are the main representations generally used in text processing. However, user classification of chat is not the usual text-processing task: instead, we look for telltale features that indicate a specific user. Here are some: