HelloWorld4382 - 1 year ago 51
Python Question

Trouble analyzing words in one file and checking if they are in each line of another file &… in Python

So, I'm trying to check whether each line in file2.txt contains any of the words in file1.txt. For example:

File 1:


File 2: a bunch of sentences (over 200 lines) that I want to check for any of the words in file 1

I have a way of doing this with my own files in my program, and it works, but it adds the totals into one big list (if the entire file says love 43 times, then Love: 43). I'm looking for separate lists for each line, so if one line contains love 4 times and another contains it 5 times, the program should indicate this. Specifically, what I'm trying to do is total the number of keywords in each line of the file (so if a line contains 4 keywords, the total for that line is 4), along with the values associated with those keywords (in my example file 1, each keyword has a value associated with it). If a line in the file is:
Hi I love my boyfriend but I like my bestfriend lol
then this line would be
{love: 1, like: 1, lol: 1} (keywords = 3, total = 25)
(the total comes from the values associated with the keywords in the list)

and if a second line is simply

I hate my life. It is the worst day ever!

then this would be
{hate: 1, worst: 1} (keywords = 2, total = 2)

I have this, and it works, but is there a way to modify it so that instead of printing one big line like

{'please': 24, 'worst': 40, 'regrets': 1, 'hate': 70, ... etc.}

it simply adds up the number of keywords per line and the values associated with them?

wordcount = {}
with open('mainWords.txt', 'r') as f1, open('sentences.txt', 'r') as f2:
    words = f1.read().split()
    wordcount = {word.split(',')[0]: 0 for word in words}

    for line in f2:
        line_split = line.split()
        for word in line_split:
            if word in wordcount:
                wordcount[word] += 1


Answer Source

As usual, collections save the day:

from collections import Counter

with open('mainWords.txt') as f:
    # Each line is expected to look like "word,value"
    sentiments = {word: int(value)
                  for word, value in
                  (line.split(",") for line in f)}

with open('sentences.txt') as f:
    for line in f:
        values = Counter(word for word in line.split() if word in sentiments)
        print(sum(values[word]*sentiments[word] for word in values))  # total
        print(len(values))  # keywords

You have the sentiment polarities in the dictionary sentiments for later use.
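
One thing to watch: a plain line.split() will not match "Love" against "love", or "life." against "life". Here is a minimal sketch of the same idea that lowercases each token and strips surrounding punctuation before looking it up. The helper names parse_sentiments and score_line, and the keyword values in the demo, are made up for illustration (the real values would come from mainWords.txt), and it assumes every non-blank keyword line looks like "word,value".

```python
import string
from collections import Counter


def parse_sentiments(lines):
    """Build {keyword: value} from lines shaped like 'word,value'."""
    sentiments = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        word, value = line.split(",")
        sentiments[word.strip().lower()] = int(value)
    return sentiments


def score_line(line, sentiments):
    """Return (per-keyword counts, number of distinct keywords, weighted total)."""
    # Lowercase and strip surrounding punctuation so "Love" and "life." still match.
    tokens = (token.strip(string.punctuation).lower() for token in line.split())
    counts = Counter(token for token in tokens if token in sentiments)
    total = sum(count * sentiments[word] for word, count in counts.items())
    return dict(counts), len(counts), total


# Hypothetical keyword values, for illustration only.
sentiments = parse_sentiments(["love,10", "like,5", "lol,2", "hate,1", "worst,1"])
for sentence in ["Hi I love my boyfriend but I like my bestfriend lol",
                 "I hate my life. It is the worst day ever!"]:
    print(score_line(sentence, sentiments))
```

With real files you could pass the open file object for mainWords.txt straight to parse_sentiments and call score_line on each line of sentences.txt. Note that len(counts) is the number of distinct keywords on the line; if you want every occurrence counted, use sum(counts.values()) instead.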
