windboy - 1 month ago
Python Question

ZeroDivisionError, but I can't find the error

I have encountered a ZeroDivisionError in a small script, but I can't find the cause. My intention is to compare the words in a text file, which contains these words:

secondly
pardon
woods
secondly


I wrote the script to compare the words in pairs, this way:

secondly, pardon
secondly, woods
secondly, secondly
pardon, woods
pardon, secondly
woods, secondly


My code does the following:

1) If the two words are the same, the score is 1; otherwise the score is calculated by the gensim vector model.
2) There is a counter, and the counter resets when the first for loop moves to the next word. E.g. secondly, pardon > secondly, woods > secondly, secondly (at this point the count is 3).

The code

from __future__ import division
import gensim


textfile = 'businessCleanTxtUniqueWords'
model = gensim.models.Word2Vec.load("businessSG")
count = 0 # keep track of counter
score = 0
avgScore = 0
SentenceScore = 0
externalCount = 0
totalAverageScore = 0

with open(textfile, 'r+') as f1:

    words_list = f1.readlines()

    for each_word in words_list:
        word = each_word.strip()

        for each_word2 in words_list[words_list.index(each_word) + 1:]:
            count = count + 1

            try:
                word2 = each_word2.strip()
                print(word, word2)
                # if words are the same
                if (word == word2):
                    score = 1
                else:
                    score = model.similarity(word,word2) # when words are not the same
            # if word is not in vector model
            except KeyError:
                score = 0
            # to keep track of the score
            SentenceScore=SentenceScore + score

            print("the score is: " + str(score))
            print("the count is: " + str(count))
        # average score
        avgScore = round(SentenceScore / count,5)

        print("the avg score: " + str(SentenceScore) + '/' + str(count) + '=' + str(avgScore))
        # reset counter and sentence score
        count = 0
        SentenceScore = 0


The error message:

Traceback (most recent call last):
  File "C:/Users/User/Desktop/Complete2/Complete/TrainedTedModel/LatestJR.py", line 41, in <module>
    avgScore = round(SentenceScore / count,5)
ZeroDivisionError: division by zero
('secondly', 'pardon')
the score is: 0.180233083443
the count is: 1
('secondly', 'woods')
the score is: 0.181432347816
the count is: 2
('secondly', 'secondly')
the score is: 1
the count is: 3
the avg score: 1.36166543126/3=0.45389
('pardon', 'woods')
the score is: 0.405021005657
the count is: 1
('pardon', 'secondly')
the score is: 0.180233083443
the count is: 2
the avg score: 0.5852540891/2=0.29263
('woods', 'secondly')
the score is: 0.181432347816
the count is: 1
the avg score: 0.181432347816/1=0.18143


I have included "from __future__ import division" for the division, but it does not seem to fix it.

My files can be found at the following links:

Gensim model:

https://entuedu-my.sharepoint.com/personal/jseng001_e_ntu_edu_sg/_layouts/15/guestaccess.aspx?guestaccesstoken=BlORQpsmI6RMIja55I%2bKO9oF456w5tBLR43XZdVCQIA%3d&docid=00459c024d33d48638508dd331cf73144&rev=1&expiration=2016-11-25T23%3a56%3a48.000Z

Textfile:

https://entuedu-my.sharepoint.com/personal/jseng001_e_ntu_edu_sg/_layouts/15/guestaccess.aspx?guestaccesstoken=7%2b8Nkm9BySPFR0zqD%2fdgUcYOaXREG3%2fycALnMFcv59A%3d&docid=08158c442c3f74970bc8090f253b499f8&rev=1&expiration=2016-11-25T23%3a56%3a01.000Z

Thank you.

Answer

This happens because, when the first for loop reaches the last word, the slice used by the second for loop is empty. The second loop is never executed, so count is still zero (it was reset to zero at the end of the previous iteration) when the average is computed. Just change the first for loop to skip the last word, since there are no later words to compare it with:

for each_word in words_list[:-1]:
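
For illustration, here is a minimal sketch of the same pairwise averaging with that fix applied. It is not your original script: average_pair_scores and toy_similarity are hypothetical names, and toy_similarity is only a stand-in for model.similarity so that the example runs without the gensim model. It also uses enumerate instead of words_list.index, which is an extra change on my part, because index returns the first match and your list contains "secondly" twice.

```python
def average_pair_scores(words, similarity):
    """Average each word's score against all the words that come after it.

    `similarity` is an assumed callable taking two words and returning a
    number; it stands in for model.similarity from the question.
    """
    results = []
    # Skip the last word: it has no later words, so its count would be 0.
    for i, word in enumerate(words[:-1]):
        later_words = words[i + 1:]
        total = 0.0
        for other in later_words:
            if word == other:
                total += 1.0  # identical words score 1
            else:
                try:
                    total += similarity(word, other)
                except KeyError:
                    pass  # word missing from the model: counts as 0
        # later_words is never empty here, so this cannot divide by zero.
        results.append((word, round(total / len(later_words), 5)))
    return results


def toy_similarity(a, b):
    # Hypothetical stand-in for model.similarity: a fixed score for any pair.
    return 0.5


words = ["secondly", "pardon", "woods", "secondly"]
for word, avg in average_pair_scores(words, toy_similarity):
    print(word, avg)
```

With the real model you would pass model.similarity (or whatever the equivalent call is in your gensim version) in place of toy_similarity.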