Marco Rimoldi - 4 years ago
Python Question

How to find words inside two text files

The first part of the script is OK (it removes http:// and www.). Next, I need to check whether the words in source are present in exists.

source = open('/net/sign/temp/python_tmp/script1/source.txt','r')
exists = open('/net/sign/temp/python_tmp/script1/exists.txt','r')

with source as f:
    lines = f.read()
    lines = lines.replace('http://','')
    lines = lines.replace('www.','')

for a in open('/net/sign/temp/python_tmp/script1/exists.txt'):
    if a == lines:
        print("ok")


The content of source.txt:

www.yahoo.it
www.yahoo.com
www.google.com
http://www.libero.it





The content of exists.txt:

www.yahoo.com

Answer

Something like this should work:

source_words = set()
with open('source.txt') as source:
    for word in source.readlines():
        source_words.add(word.replace('http://','').replace('www.','').strip())

exist_words = set()
with open('exists.txt') as exist:
    for word in exist.readlines():
        exist_words.add(word.replace('http://','').replace('www.','').strip())

print("There {} words from 'source.txt' in 'exists.txt'".format(
   "are" if exist_words.intersection(source_words) else "aren't"
))
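
For context, the comparison in the question probably never prints "ok" because `a` is a single line (including its trailing newline), while `lines` holds the entire cleaned content of source.txt as one string. Here is a small sketch of the difference, with the sample data from the question held in memory instead of read from files. The names `source_text`, `exists_lines` and `clean` are illustrative only:

```python
# Sample data from the question, held in memory instead of read from files.
source_text = "www.yahoo.it\nwww.yahoo.com\nwww.google.com\nhttp://www.libero.it\n"
exists_lines = ["www.yahoo.com\n"]

# What the question's script does: clean the whole file as ONE string.
lines = source_text.replace('http://', '').replace('www.', '')

# Each single line is compared against the whole file content: no match.
whole_file_match = any(a == lines for a in exists_lines)

# Cleaning and stripping each line separately allows a per-word test.
def clean(word):
    return word.replace('http://', '').replace('www.', '').strip()

source_words = {clean(w) for w in source_text.splitlines()}
per_word_match = any(clean(a) in source_words for a in exists_lines)

print(whole_file_match, per_word_match)  # False True
```

Note that the line from exists.txt has to be cleaned in the same way, since it still carries the www. prefix.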

If you need the exact words that are present in both files, they are the result of the intersection:

print("These words are in both files:")
for word in exist_words.intersection(source_words):
    print(word)
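
If this check is needed in more than one place, the same idea could be wrapped in a function. The following is only a sketch: `normalize` and `common_words` are hypothetical helper names, and `io.StringIO` stands in for the two open files so the example runs without touching the disk. It also skips blank lines, on the assumption that an empty line should not count as a word shared by both files:

```python
import io

def normalize(line):
    """Drop 'http://' and 'www.' plus surrounding whitespace from one line."""
    return line.replace('http://', '').replace('www.', '').strip()

def common_words(source_lines, exists_lines):
    """Return the normalized words that occur in both iterables of lines."""
    source_words = {normalize(l) for l in source_lines if l.strip()}
    exist_words = {normalize(l) for l in exists_lines if l.strip()}
    return source_words & exist_words

# io.StringIO stands in for the two open files from the question.
source = io.StringIO(
    "www.yahoo.it\nwww.yahoo.com\nwww.google.com\nhttp://www.libero.it\n\n"
)
exists = io.StringIO("www.yahoo.com\n")

print(sorted(common_words(source, exists)))  # ['yahoo.com']
```

Because the function accepts any iterable of lines, it should also work with the file objects returned by `open('source.txt')` and `open('exists.txt')`.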