gensim: generating LSI model causes "Python has stopped working"

I am trying to use gensim to generate an LSI model, along with corpus_lsi, following this tutorial.

I start with a corpus and a dictionary that I generated myself.
The list of documents is very small (9 lines = 9 documents); it is the sample list provided in the gensim tutorials.

However, Python just crashes when it reaches the line that generates the LSI model.
My code and its output are below.


#!/usr/bin/env python
import os
from gensim import corpora, models, similarities
import logging

#logging.basicConfig(format='%(asctime)s : %(levelname)s : %(message)s', level=logging.INFO)

if __name__ == '__main__':
    # Load the dictionary and corpus written earlier by the dataset generator
    if os.path.exists(os.path.join("tmp", "dictionary.dict")):
        dictionary = corpora.Dictionary.load(os.path.join("tmp", "dictionary.dict"))
        corpus = corpora.MmCorpus(os.path.join("tmp", "corpus.mm"))
        print("Used files generated Dataset Generator")
    else:
        print("Please run dataset generator")

    print("generating tf-idf model ...")
    tfidf = models.TfidfModel(corpus)  # Generate tfidf matrix (tf-idf model)
    print("generating corpus_tf-idf model ...")
    corpus_tfidf = tfidf[corpus]  # use the model to transform vectors

    print("generating LSI model ...")
    lsi = models.LsiModel(corpus_tfidf, id2word=dictionary, num_topics=2)  # initialize an LSI transformation
    print("generating corpus_lsi model ...")
    corpus_lsi = lsi[corpus_tfidf]  # create a double wrapper over the original corpus: bow->tfidf->fold-in-lsi



Used files generated Dataset Generator
generating tf-idf model ...
generating corpus_tf-idf model ...
generating LSI model ...

After printing "generating LSI model ...", it crashes.

Any suggestions?

Other things I tried:

  • Switching to Python 2.6

  • Removing gensim and installing it again from GitHub (instead of conda)

Answer

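It seems the issue was with the call used in the tutorial (perhaps it has been deprecated or otherwise changed).

I changed the line

lsi = models.LsiModel(corpus_tfidf, id2word=dictionary, num_topics=2) # initialize an LSI transformation

to

lsi = LsiModel(corpus_tfidf, num_topics=2)

that is, I dropped the id2word=dictionary argument (this form assumes LsiModel is imported directly, e.g. from gensim.models import LsiModel).

With that change, it worked fine.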
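
A crash like "Python has stopped working" happens below the Python level, so no exception or traceback is printed. If it comes back, the following is a minimal sketch for narrowing it down, assuming Python 3.3 or later (where the standard-library faulthandler module is available). It re-enables the logging line that is commented out in the question and asks faulthandler to write the Python stack to a file if the interpreter dies. The helper name enable_crash_diagnostics and the log file name are made up for illustration; they are not part of gensim.

```python
import faulthandler
import logging


def enable_crash_diagnostics(traceback_path="lsi_crash_traceback.log"):
    """Turn on INFO logging and dump the Python stack to a file on a fatal crash.

    Hypothetical helper for illustration; the default file name is an assumption.
    """
    # Same format string as the commented-out logging line in the question.
    # At INFO level, log messages show how far the model gets before the crash.
    logging.basicConfig(
        format="%(asctime)s : %(levelname)s : %(message)s",
        level=logging.INFO,
    )
    # faulthandler writes the Python traceback to this file on a fatal error
    # such as a segmentation fault. The file has to stay open while the
    # handler is enabled, so it is returned to the caller.
    traceback_file = open(traceback_path, "w")
    faulthandler.enable(file=traceback_file, all_threads=True)
    return traceback_file
```

Call enable_crash_diagnostics() once at the top of the script, before building the tf-idf and LSI models, and keep the returned file object alive until the script ends. If the process still dies, the last lines of the log file should show which Python call was running at the time, which helps tell a problem in the script apart from one in the compiled numerical libraries underneath.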

