Tomas Greif - 1 month ago
Python Question

Remove or keep specific columns in csv file

I have a simple script that either removes the last n columns from a CSV file or keeps only the first n columns:

from sys import argv
import csv

if len(argv) == 4:
    script, inputFile, outputFile, n = argv
    n = [int(i) for i in n.split(",")]
else:
    script, inputFile, outputFile = argv
    n = 1

with open(inputFile,"r") as fin:
    with open(outputFile,"w") as fout:
        writer=csv.writer(fout)
        for row in csv.reader(fin):
            writer.writerow(row[:n])


Example usage (remove last two columns):
removeKeepColumns.py sample.txt out.txt -2


How do I extend this to keep or remove a specific set of columns, e.g.:


  • remove columns 3,4,5

  • keep only columns 1,4,6



I can split the comma-separated input argument into a list, but I don't know how to pass it to writerow(row[]).


Links to scripts I used to create my example:


Answer

Elaborating on my comment (Picking out items from a python list which have specific indexes)

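from sys import argv
import csv

if len(argv) == 4:
  script, inputFile, outputFile, cols_str = argv
  # Zero-based column indexes, e.g. "0,3,5"
  cols = [int(i) for i in cols_str.split(",")]
else:
  # Without a column list, cols would be undefined below
  raise SystemExit("usage: removeKeepColumns.py inputFile outputFile cols")

with open(inputFile,"r") as fin:
  with open(outputFile,"w") as fout:
    writer=csv.writer(fout)
    for row in csv.reader(fin):
      # Keep only the requested columns, in the order given
      sublist = [row[x] for x in cols]
      writer.writerow(sublist)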

This should (untested) keep all the columns given as a comma-separated list of zero-based indexes in the 3rd parameter. To remove the given columns instead,

sublist = [value for i, value in enumerate(row) if i not in cols]

should do the trick.
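
For completeness, here is a minimal sketch that puts the keep and remove variants side by side. It is only an illustration, not tested against the asker's files: the helper names select_columns and filter_csv and the keep flag are hypothetical, and it assumes Python 3 and that column numbers are given 1-based as in the question (so they are shifted to 0-based before indexing). It reads from in-memory strings instead of real files so it can be run as-is.

```python
import csv
import io


def select_columns(row, cols, keep=True):
    """Return the fields of `row` at the 0-based positions in `cols`
    (keep=True), or every field except those positions (keep=False)."""
    if keep:
        # Preserve the order given in `cols`; skip positions outside the row.
        return [row[i] for i in cols if 0 <= i < len(row)]
    drop = set(cols)
    return [value for i, value in enumerate(row) if i not in drop]


def filter_csv(fin, fout, cols, keep=True):
    """Copy CSV rows from `fin` to `fout`, keeping or dropping `cols`."""
    writer = csv.writer(fout, lineterminator="\n")
    for row in csv.reader(fin):
        writer.writerow(select_columns(row, cols, keep))


# Column numbers as written in the question (1-based), shifted to 0-based.
keep_cols = [int(c) - 1 for c in "1,4,6".split(",")]
drop_cols = [int(c) - 1 for c in "3,4,5".split(",")]

sample = "a,b,c,d,e,f\n1,2,3,4,5,6\n"

kept = io.StringIO()
filter_csv(io.StringIO(sample), kept, keep_cols, keep=True)

dropped = io.StringIO()
filter_csv(io.StringIO(sample), dropped, drop_cols, keep=False)

print(kept.getvalue())     # rows: a,d,f and 1,4,6
print(dropped.getvalue())  # rows: a,b,f and 1,2,6
```

With real files, the same filter_csv call can be given the fin and fout objects from the with open(...) blocks above in place of the StringIO objects.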
