android_dev - 4 months ago
Python Question

Count occurrences of a key per column across many columns in a CSV

I have a CSV file like this:

col1,col2,col3
t,t,t
f,f,f
t,f,t


The file is quite big (50 MB) and has many columns.

I need to count the number of t values in each column.

I tried this:

import csv
import collections

col1 = collections.Counter()
with open('file.csv') as input_file:
    for row in csv.reader(input_file, delimiter=','):
        col1[row[0]] += 1

print('Number of t in col1: %s' % col1['t'])


But this only counts the first column (col1). How do I count across many columns?

Answer
import csv

# Maps column index -> number of cells equal to 't' in that column.
totals = {}

with open('file.csv') as input_file:
    for row in csv.reader(input_file, delimiter=','):
        for column, cell in enumerate(row):
            # Give every column an entry, even if it never contains 't'.
            if column not in totals:
                totals[column] = 0
            if cell == 't':
                totals[column] += 1

for column in totals:
    print('column %d has %d trues' % (column, totals[column]))
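
If you would rather see the column names from the header than numeric indices, here is a possible variant that reuses collections.Counter from the question. It is only a sketch: the function name count_value_per_column and its value parameter are made up for illustration, and it assumes the first row of the file is the header. It takes any iterable of lines, so the sample data from the question can be fed in directly.

```python
import csv
from collections import Counter


def count_value_per_column(lines, value='t'):
    """Count how often `value` occurs in each column, keyed by header name.

    `lines` is any iterable of CSV lines (an open file or a list of strings).
    Assumes the first row is the header.
    """
    reader = csv.reader(lines, delimiter=',')
    header = next(reader)  # first row holds the column names
    # Start every column at 0 so columns without a match are still reported.
    counts = Counter({name: 0 for name in header})
    for row in reader:
        for name, cell in zip(header, row):
            if cell == value:
                counts[name] += 1
    return counts


# The sample data from the question, as an in-memory list of lines.
sample = ['col1,col2,col3', 't,t,t', 'f,f,f', 't,f,t']
counts = count_value_per_column(sample)
for name in ['col1', 'col2', 'col3']:
    print('%s has %d trues' % (name, counts[name]))
```

For the real file, pass the open file object instead of the list, e.g. count_value_per_column(input_file) inside the with open('file.csv') block (on Python 3 the csv documentation recommends opening with newline=''). Rows are still read one at a time, so the 50 MB file is never loaded into memory as a whole.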