TigerRedMike - 5 months ago
Python Question

Python and CSV; how to truncate all values in a column?

Given a simple CSV file like this:

Django,Gunslinger,101-707
KingSchultz,Dentist,205-707
Tatum,Marshall,615-707
Broomhilda,Wife,910-707
...,...,...


How do you truncate all the values in the last column so that only the first three digits remain? (unrelated: so they can be used in math operations)

Desired CSV:

Django,Gunslinger,101
KingSchultz,Dentist,205
Tatum,Marshall,615
Broomhilda,Wife,910
...,...,...


Here is what I have tried so far:

import csv
import re
r = csv.reader(open(input.csv))
for row in r:
    re.sub('\-.*', '', row[3])
    writer = csv.writer(open('output.csv', 'w'))
    writer.writerow(row)


I've verified that the regex in re.sub works correctly. I have tried dozens of variations and spent many hours searching, but cannot get the desired output.

Answer

Without using the re module:

import csv

r = csv.reader(open("sample.csv", "rb"))
writer = csv.writer(open("output.csv", "wb"))

for row in r:
    row[2] = row[2][:3]
    writer.writerow(row)

As @TigerRedMike pointed out, in Python 3.x the files should be opened with 'r' and 'w' instead of 'rb' and 'wb'. The csv module documentation also recommends passing newline='' to open() in Python 3.
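For Python 3, the same idea might look like the sketch below. The helper name truncate_last_column, its width parameter, and the lineterminator="\n" setting are my own choices for illustration, not part of the original answer. It also sidesteps two things that likely tripped up the attempt in the question: re.sub returns a new string rather than modifying row in place, so its result has to be assigned back, and a three-column row has no index 3 (the last field is row[2], or row[-1]). The filename in open(input.csv) would also need quotes, and if the writer is created inside the loop, the output file is reopened and truncated on every row.

```python
import csv
import io


def truncate_last_column(src, dst, width=3):
    """Copy CSV rows from src to dst, keeping only the first `width`
    characters of the last field in each row."""
    # lineterminator="\n" is an assumption; the csv default is "\r\n".
    writer = csv.writer(dst, lineterminator="\n")
    for row in csv.reader(src):
        if row:  # skip blank lines, which csv.reader yields as []
            row[-1] = row[-1][:width]
            writer.writerow(row)


# In-memory demonstration using two of the sample rows from the question.
sample = io.StringIO(
    "Django,Gunslinger,101-707\n"
    "KingSchultz,Dentist,205-707\n"
)
result = io.StringIO()
truncate_last_column(sample, result)
print(result.getvalue())

# With real files (same filenames as in the answer above), this would be:
# with open("sample.csv", newline="") as src, \
#         open("output.csv", "w", newline="") as dst:
#     truncate_last_column(src, dst)
```

Slicing only works because every value in the sample starts with exactly three digits. If the prefix length can vary, row[-1].split("-")[0] or the assigned result of re.sub(r"-.*", "", row[-1]) would be a safer way to drop everything from the hyphen onward.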
