Jack-Jack Jack-Jack - 3 years ago 140
Python Question

How to append a string to specific strings in a file

I am tagging a corpus and I want to label every stop word as NOTRELATED. I tried to do it in Python, but it's not working. I'm new to Python.

stop_words = set(stopwords.words('english'))
for line in word_tokenize(input_file):
    if stop_words in line:
        line = line + " NOTRELATED\n"
output_file.write(line)


Sample input (text file):

The

cost

of

damage

to

agriculture

and

infrastructure

in

areas

devastated

by

Typhoon

Lando

has
soared

to

more

than

P6.3

billion

.

Output (file):

The

cost

of NOTRELATED

damage

to NOTRELATED

agriculture

and NOTRELATED

infrastructure

in NOTRELATED

areas

.

.
.

Answer

There are a couple of issues. First, you should check whether each word from the input file is in the stop-word set, not the other way around. So

if stop_words in line:

should be:

if line in stop_words:
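
To see why the order of the operands matters, here is a minimal sketch. The three-word stop list and the variable names reversed_check_failed and is_stop_word are made up for illustration. With a string on the right-hand side, Python looks for a substring and rejects a set on the left; with a set on the right-hand side, it does the membership test you want.

```python
stop_words = {"of", "to", "and"}  # hypothetical three-word stop list
word = "of"

try:
    stop_words in word  # a str only accepts another str as the left operand
    reversed_check_failed = False
except TypeError:
    reversed_check_failed = True

is_stop_word = word in stop_words  # set membership: the intended test
print(reversed_check_failed, is_stop_word)
```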

The rest looks mostly like an indentation issue. Instead of writing line to the file once after the for loop completes, write to the file inside the loop so that every token is written. Also, word would be a better name than line, since word_tokenize returns words:

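from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize

stop_words = set(stopwords.words('english'))
for word in word_tokenize(input_file):  # input_file must hold the text as a string
    print(word, 'NOTRELATED' if word in stop_words else '', file=output_file)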
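
If you want to try the idea without NLTK installed, the following is a rough, self-contained sketch rather than a drop-in replacement. It assumes a small hand-written stop list in place of stopwords.words('english'), plain str.split() in place of word_tokenize, and an in-memory buffer in place of the output file. The helper name tag_stop_words is made up for this example.

```python
import io

# Hypothetical stand-in for set(stopwords.words('english')); not the full NLTK list.
STOP_WORDS = {"of", "to", "and", "in", "by", "has", "more", "than"}


def tag_stop_words(tokens, stop_words, out):
    """Write one token per line, appending NOTRELATED to stop words."""
    for word in tokens:
        if word in stop_words:  # case-sensitive: "The" does not match "the"
            out.write(word + " NOTRELATED\n")
        else:
            out.write(word + "\n")


text = "The cost of damage to agriculture and infrastructure in areas"
buffer = io.StringIO()  # stands in for the opened output file
tag_stop_words(text.split(), STOP_WORDS, buffer)
tagged = buffer.getvalue()
print(tagged)
```

The write happens inside the loop, so every token reaches the output. The comparison is case-sensitive, which matches the sample output in the question, where "The" is left untagged. Lowercase the word before the check if capitalized stop words should be tagged too.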