gbZDB gbZDB - 4 months ago 8
Python Question

Split text from a file into n-grams in Python

I have to split the words of a text file into lists of a given number of words each, all collected in one outer list. It is probably best to show this with an example.

Say the text file looks like this:

"i am having a good day today"


I have to write a function that is called like this:

ngrams.makeNGrams("ngrams.txt", 2)
# Since the second argument is 2, the output should look like this:

[['i', 'am'], ['am', 'having'], ['having', 'a'], ['a', 'good'], ['good', 'day'], ['day', 'today']]


If the function were called like this:

ngrams.makeNGrams("ngrams.txt", 3)

# it should return:

[['i', 'am', 'having'], ['having', 'a', 'good'], ['good', 'day', 'today']]


Does anybody know how I should best deal with this?
Thanks a lot in advance.

Answer

Define:

def ngrams(text, n):
    # Split on whitespace, then take every window of n consecutive words.
    words = text.split()
    return [words[i:i+n] for i in range(len(words) - n + 1)]

And use:

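s = "i am having a good day today"
ngrams(s, 2)
# [['i', 'am'], ['am', 'having'], ['having', 'a'], ['a', 'good'], ['good', 'day'], ['day', 'today']]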
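
Note that the question asks for a function that takes a file name, and that its n = 3 example shows consecutive groups sharing exactly one word, whereas the function above takes a string and returns every window of n words. If the asker's output is what is wanted, one possible adaptation is sketched below. The name `make_ngrams` and the `step` parameter are hypothetical, and the sketch assumes the file holds plain whitespace-separated words with no surrounding quotation marks.

```python
def make_ngrams(path, n, step=1):
    """Read whitespace-separated words from the file at `path` and group them.

    `step` (a hypothetical parameter, not part of the question) is how many
    words to advance between groups: step=1 yields every window of n words,
    step=n-1 makes neighbouring groups share exactly one word.
    """
    if n < 1 or step < 1:
        raise ValueError("n and step must be positive integers")
    # Assumption: the file is UTF-8 text containing only the words themselves.
    with open(path, encoding="utf-8") as f:
        words = f.read().split()
    # Start a group every `step` words, keeping only full groups of n words.
    return [words[i:i + n] for i in range(0, len(words) - n + 1, step)]
```

With the sample sentence saved in ngrams.txt, make_ngrams("ngrams.txt", 3, step=2) would give the three groups listed in the question, while the default step=1 behaves like ngrams above and returns every window.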