Muscles Muscles - 2 years ago 69
Python Question

Reading data from one CSV and displaying parsed data on to another CSV file

I am very new to Python. I am trying to read a CSV file and write selected rows from it to another CSV file. Below is the code I have written so far. It reads every row from the input file, 1.csv, and writes it to the output file, out.csv. How can I tweak this code so that the output file contains only those rows where column 8 starts with READ and column 10 is not equal to 0000? Both conditions must be met. Also, this code handles a single CSV file; how can I do the same for, say, 10000 CSV files? Finally, when I run the code I see blank lines between the rows in out.csv. How can I remove them?

import csv
f1 = open("1.csv", "r")
reader = csv.reader(f1)
header = reader.next()
f2 = open("out.csv", "w")
writer = csv.writer(f2)
writer.writerow(header)
for row in reader:
    writer.writerow(row)
f1.close()
f2.close()

Answer Source

Something like:

import os
import csv
import glob

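class CSVReadWriter(object):

    def munge(self, filename, suffix):
        # Build the output name, e.g. "1.csv" -> "1_out.csv"
        name, ext = os.path.splitext(filename)
        return '{0}{1}{2}'.format(name, suffix, ext)

    def is_valid(self, row):
        # Keep rows where column 8 starts with READ and column 10 is not 0000.
        # Indices are zero-based; use 7 and 9 if you count columns from 1.
        return row[8].startswith('READ') and row[10] != '0000'

    def filter_csv(self, fin, fout):
        reader = csv.reader(fin)
        writer = csv.writer(fout)

        writer.writerow(next(reader))  # header
        for row in reader:
            if self.is_valid(row):
                writer.writerow(row)

    def read_write(self, iname, suffix):
        # 'rb'/'wb' is the Python 2 idiom; on Python 3 use text mode with newline=''
        with open(iname, 'rb') as fin:
            oname = self.munge(iname, suffix)
            with open(oname, 'wb') as fout:
                self.filter_csv(fin, fout)

work_directory = r"C:\Temp\Data"

# glob needs a pattern, not just the directory name
for filename in glob.glob(os.path.join(work_directory, '*.csv')):
    csvrw = CSVReadWriter()
    csvrw.read_write(filename, '_out')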

I've made it a class so that you can override the munge and is_valid methods to suit different cases. Being a class also means that you can store state more easily, for example if you wanted to output only the lines that fall between a start condition and an end condition.
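
To illustrate the point about state, here is a minimal sketch of a filter whose is_valid remembers whether it is currently between two marker rows. The class name BetweenMarkersFilter, the helper filter_rows, and the START/END marker values are all hypothetical and not part of the question's data; the sketch assumes Python 3 and uses an in-memory sample so it runs without any files.

```python
import csv
import io


class BetweenMarkersFilter(object):
    """Keep rows that appear after a START row and before an END row.

    The marker values and the column index are hypothetical.
    """

    def __init__(self, column=0, start='START', end='END'):
        self.column = column
        self.start = start
        self.end = end
        self.inside = False  # state carried from one row to the next

    def is_valid(self, row):
        value = row[self.column]
        if value == self.start:
            self.inside = True
            return False  # do not emit the marker row itself
        if value == self.end:
            self.inside = False
            return False
        return self.inside


def filter_rows(rows, row_filter):
    """Return only the rows accepted by row_filter.is_valid."""
    return [row for row in rows if row_filter.is_valid(row)]


# Small in-memory example so the sketch runs without any files.
sample = io.StringIO("a,1\nSTART,0\nb,2\nc,3\nEND,0\nd,4\n")
kept = filter_rows(csv.reader(sample), BetweenMarkersFilter())
print(kept)  # [['b', '2'], ['c', '3']]
```

A plain function could not do this without a global variable, which is why keeping the flag on the instance is convenient here.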

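The blank lines you mention come from the line endings: the csv writer already ends each row with \r\n, and a file opened in text mode on Windows turns the \n into another \r\n, giving \r\r\n. In Python 2, opening the output file with 'wb' resolves it; in Python 3, open it in text mode with newline='' instead.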
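
For Python 3, the same filtering can be sketched with newline='' on both files, which is what the csv module documentation recommends. The function names keep_row and filter_file are illustrative, the column indices 8 and 10 follow the answer's code (zero-based, so adjust them if you count from 1), and the sample data is made up purely to exercise the code.

```python
import csv
import os
import tempfile


def keep_row(row):
    # Indices 8 and 10 follow the answer's code (zero-based); adjust if needed.
    return len(row) > 10 and row[8].startswith('READ') and row[10] != '0000'


def filter_file(src, dst):
    # newline='' lets the csv module control line endings itself,
    # so no extra blank lines appear on Windows.
    with open(src, 'r', newline='') as fin, open(dst, 'w', newline='') as fout:
        reader = csv.reader(fin)
        writer = csv.writer(fout)
        writer.writerow(next(reader))  # header
        writer.writerows(row for row in reader if keep_row(row))


# Made-up sample data in a temporary directory.
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, '1.csv')
dst = os.path.join(workdir, 'out.csv')
header = ['c{0}'.format(i) for i in range(11)]
rows = [
    ['x'] * 8 + ['READ_OK', 'x', '1234'],  # kept
    ['x'] * 8 + ['READ_OK', 'x', '0000'],  # dropped: index 10 is 0000
    ['x'] * 8 + ['WRITE', 'x', '1234'],    # dropped: index 8 lacks READ prefix
]
with open(src, 'w', newline='') as f:
    csv.writer(f).writerows([header] + rows)

filter_file(src, dst)
with open(dst, 'rb') as f:
    raw = f.read()
```

Reading the output back in binary mode is a quick way to confirm that each row ends in a single \r\n and nothing more.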
