Imran Ali - 8 months ago
R Question

Is it possible to maintain the order of n-grams in the output of the textcnt function in R?

I am using the textcnt function from the tau package to obtain bigrams as follows:

sentence <- "A sample sentence in English for testing purpose"
english <- textcnt(sentence, method = "string", n=2, tolower = FALSE)

The bigrams returned are in alphabetical order, like this:

A sample English for for testing in English sample sentence sentence in testing purpose

However, I am looking for a solution that returns the bigrams in the order in which they appear in the sentence. To be exact, the desired output is:

A sample sample sentence sentence in in English English for for testing testing purpose

If this is not possible with textcnt, is there an alternative way to achieve the desired output?



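library(tokenizers)  # tokenize_ngrams() comes from the tokenizers package
# Bigrams are returned in sentence order; the text is lowercased by default (lowercase = TRUE)
tokenize_ngrams(sentence, n = 2L)
# [[1]]
# [1] "a sample"        "sample sentence" "sentence in"     "in english"      "english for"     "for testing"     "testing purpose"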
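
The output above is lowercased, whereas the question asks for the original case ("A sample", "in English"). If an extra package is not wanted, the same ordered bigrams can probably be built in base R as well. The following is only a minimal sketch: `ordered_ngrams` is a hypothetical helper name (it is not part of tau or tokenizers), and it assumes that words are separated by whitespace and that punctuation should stay attached to the words.

```r
# Hypothetical helper (not from tau or tokenizers): returns word n-grams
# in the order they occur in the text, without changing case.
ordered_ngrams <- function(text, n = 2L) {
  # Assumption: words are separated by runs of whitespace.
  words <- strsplit(trimws(text), "\\s+")[[1]]
  # Fewer words than n means there is no complete n-gram.
  if (length(words) < n) return(character(0))
  starts <- seq_len(length(words) - n + 1L)
  # Join each window of n consecutive words with a single space.
  vapply(
    starts,
    function(i) paste(words[i:(i + n - 1L)], collapse = " "),
    character(1)
  )
}

sentence <- "A sample sentence in English for testing purpose"
bigrams <- ordered_ngrams(sentence, n = 2L)
print(bigrams)
```

Because the windows are generated from left to right, the result follows the word order of the sentence rather than alphabetical order, and the original capitalization is kept.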