oliver Bergman - 1 month ago
Python Question

Show the 7 most common words found in a text, but filter out words that appear in a list of common words

I would really appreciate some help solving this, or a pointer in the right direction. Thanks!

The task: show the 7 most common words found in the text, but filter out the words that are common words. A list of common words is provided in common-words.txt.

common-words.txt contains a long list of different words.

First, I found the 7 most common words in the text. This is what my code looks like:

print("The 7 most frequently used words is:")
print("\n")

import re
from collections import Counter

with open("alice-ch1.txt") as f:
    passage = f.read()

words = re.findall(r'\w+', passage)

cap_words = [word.upper() for word in words]

word_counts = Counter(cap_words).most_common(7)

print(word_counts)


This works, and I get the output:

[('THE', 93), ('SHE', 80), ('TO', 75), ('IT', 67), ('AND', 65), ('WAS', 53), ('A', 52)]


Now I want to compare these two text files: if any word from my text file is also in common-words.txt, I want it removed from the result.

I have tried this code:

dic_no_cw = dict(word_counts)
with open("common-words.txt", 'r') as cw:
    commonwords = list(cw.read().split())
    for key, value in list(dic_no_cw.items()):
        for line in commonwords:
            if key == line:
                del dic_no_cw[key]

dict_copy = dict(dic_no_cw)

dic_no_cw7 = Counter(dic_no_cw).most_common(7)
sorted(dic_no_cw7)

print(dic_no_cw7)


But I get the same output as before:

[('THE', 93), ('SHE', 80), ('TO', 75), ('IT', 67), ('AND', 65), ('WAS', 53), ('A', 52)]


I could really use some help solving this, or a hint so that I can figure it out myself.

Thanks,

Answer

Try replacing these lines of your code:

for line in commonwords:
    if key == line:
        del dic_no_cw[key]

with

for line in commonwords:
    if key.strip() == line.upper().strip():
        del dic_no_cw[key]
        break
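
The likely reason the original comparison never matches is that the keys were upper-cased with word.upper(), while the words in common-words.txt are presumably lower-case (an assumption, since the file contents are not shown). Note also that removing words from the top 7 after counting can leave fewer than 7 words, or none at all. A minimal sketch of one alternative is to filter out the common words before counting. The function name top_words and the inline sample data are made up for illustration:

```python
import re
from collections import Counter


def top_words(text, common_words, n=7):
    """Return the n most frequent words in text, skipping common words.

    The comparison is case-insensitive: both sides are upper-cased.
    """
    # A set gives fast membership tests; upper() matches the counted words.
    stop = {w.strip().upper() for w in common_words}
    words = (w.upper() for w in re.findall(r"\w+", text))
    # Filter BEFORE counting so that up to n words remain after exclusion.
    return Counter(w for w in words if w not in stop).most_common(n)


# Small inline sample standing in for the two files (hypothetical data).
sample_text = "The cat and the hat. The cat sat; a cat sat."
sample_common = ["the", "and", "a"]
print(top_words(sample_text, sample_common, n=2))  # [('CAT', 3), ('SAT', 2)]
```

With the real files, the same function could be called with the contents of alice-ch1.txt as text and cw.read().split() from common-words.txt as common_words, assuming that file holds whitespace-separated words.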