Leandro Jimenez - 2 months ago
R Question

Count the frequency of words after a specific word

I have many tweets stored as text.

I would like to know how often each word follows a specific word.
For instance, I have these tweets and I want the frequency of each word that comes after "love":

My love is...
My love is...
the love was...
the love were...


to get this result:

word   next word   frequency

Love   is          2
Love   was         1
Love   were        1


or, for all words:

word   next word   frequency

My     Love        2
the    love        2
Love   is          2
Love   was         1
Love   were        1

Answer

The following procedure might help.

Step 1 (optional): Creating some example data

example <- c("my love is","my love is","banana","apple","the love was","the love were")

This vector looks like

"my love is"    "my love is"    "banana"        "apple"         "the love was"  "the love were"

Step 2: Keeping only the entries of the vector that contain the word "love"

ex2 <- example[grep("love",example)]

which gives you

"my love is"    "my love is"    "the love was"  "the love were"

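One caveat with this filter: grep("love", ...) keeps any entry that contains those four letters, so strings such as "a glove is" or "lovely day" would pass as well. A minimal sketch of a stricter filter, assuming Perl-style regular expressions are acceptable. The two extra entries and the names loose and strict are made up for illustration:

```r
example <- c("my love is", "my love is", "banana", "apple",
             "the love was", "the love were", "a glove is", "lovely day")

# Plain substring match: also keeps "a glove is" and "lovely day"
loose <- example[grep("love", example)]

# \\b marks a word boundary, so only the standalone word "love" matches
strict <- example[grep("\\blove\\b", example, perl = TRUE)]
```

Adding ignore.case = TRUE to the grep call would also catch "Love" at the start of a tweet, if that is wanted.
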
Step 3: Building a table of the text that follows the word "love"

ex3 <- table(gsub(".*love","",ex2))

which gives you

   is   was  were 
    2     1     1 
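
Note that the gsub call keeps everything after "love" (including the leading space), so longer tweets would be counted by their whole remainder rather than by the next word, and the "all words" part of the question is not covered. A possible base R sketch for both cases, assuming that words are runs of letters, digits and apostrophes and that case should be ignored. The names tweets, bigrams_of, word_pairs, all_counts and love_counts are hypothetical:

```r
tweets <- c("My love is...", "My love is...", "the love was...", "the love were...")

# Lower-case each tweet, then split on anything that is not a letter, digit or apostrophe
words_per_tweet <- strsplit(tolower(tweets), "[^a-z0-9']+")

# Hypothetical helper: all (word, next word) pairs within one tweet
bigrams_of <- function(w) {
  w <- w[nzchar(w)]                 # drop the empty string left by a leading separator
  if (length(w) < 2) return(NULL)   # a single word has no successor
  data.frame(word = head(w, -1), next_word = tail(w, -1),
             stringsAsFactors = FALSE)
}

# Pairs are built per tweet, so no pair spans two tweets
word_pairs <- do.call(rbind, lapply(words_per_tweet, bigrams_of))

# Cross-tabulate and keep only the pairs that actually occur
all_counts <- as.data.frame(table(word = word_pairs$word,
                                  next_word = word_pairs$next_word),
                            stringsAsFactors = FALSE)
all_counts <- all_counts[all_counts$Freq > 0, ]

# Frequencies of the words that follow "love" only
love_counts <- all_counts[all_counts$word == "love", ]
```

all_counts corresponds to the "all words" table in the question and love_counts to the first one. For large collections of tweets a dedicated text-mining package may be faster, but the idea stays the same.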