Cesar Cesar - 2 months ago 7
Python Question

Deleting a row if it contains a string in a CSV file

I'm having trouble deleting rows in a text file that contain a string in one column. My code so far reads the text file and saves it as a CSV file with separate columns, but the rows are not getting deleted.

This is what the values in that column look like:

Ship To or Bill To
------------------
3000000092-BILL_TO
3000000092-SHIP_TO
3000004000_SHIP_TO-INAC-EIM


And there are 20 more columns and 50,000k plus rows. So essentially I'm trying to delete all the rows that contain the strings 'INAC' or 'EIM'.

import csv

my_file_name = "NVG.txt"
cleaned_file = "cleanNVG.csv"
remove_words = ['INAC','EIM']

with open(my_file_name, 'r', newline='') as infile, \
     open(cleaned_file, 'w', newline='') as outfile:
    writer = csv.writer(outfile)
    for line in csv.reader(infile, delimiter='|'):
        if not any(remove_word in line for remove_word in remove_words):
            writer.writerow(line)

Answer

The problem here is that the csv.reader object returns the rows of the file as lists of individual column values, so the "in" test is checking to see whether any of the individual values in that list is equal to a remove_word.
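
To make the difference concrete, here is a small sketch using one of the sample values from the question. The second field in the row is made up for illustration.

```python
# Hypothetical row, shaped like what csv.reader yields: a list of field strings.
row = ["3000004000_SHIP_TO-INAC-EIM", "some other field"]

# "in" on a list asks: is any whole field exactly equal to 'INAC'?
in_list = 'INAC' in row

# "in" on a string asks: is 'INAC' a substring of this field?
in_field = 'INAC' in row[0]

print(in_list)   # False
print(in_field)  # True
```

So the original test only drops a row when some field is exactly 'INAC' or 'EIM', which never happens with values like the ones shown.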

A quick fix would be to try

        if not any(remove_word in element for element in line for remove_word in remove_words):

because the any(...) expression is true whenever some field in the line contains one of the remove_words as a substring, so that row is skipped.
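
Put together, the fix might look like the sketch below. The function name write_clean_rows and the second column in the sample data are assumptions made for illustration; the in-memory io.StringIO objects stand in for the real files so the example runs on its own.

```python
import csv
import io


def write_clean_rows(infile, outfile, remove_words, delimiter='|'):
    """Copy rows from infile to outfile, skipping any row in which
    some field contains one of remove_words as a substring."""
    writer = csv.writer(outfile)
    for row in csv.reader(infile, delimiter=delimiter):
        # Substring test against every field, not equality against the list.
        if not any(word in field for field in row for word in remove_words):
            writer.writerow(row)


# In-memory demo: first column taken from the question, second column made up.
sample = (
    "3000000092-BILL_TO|a\n"
    "3000000092-SHIP_TO|b\n"
    "3000004000_SHIP_TO-INAC-EIM|c\n"
)
out = io.StringIO()
write_clean_rows(io.StringIO(sample), out, ['INAC', 'EIM'])
kept = out.getvalue().splitlines()
print(kept)
```

With the real files, the same function could be called inside the with block from the question, passing infile and outfile. Note that this checks every field; if only the "Ship To or Bill To" column should be tested, the check would need that column's index, which the question does not give.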