thefragileomen thefragileomen - 26 days ago 7
Python Question

Parsing CSV / tab-delimited txt file with Python

I currently have a CSV file which, when opened in Excel, has a total of 5 columns. Only columns A and C are of any significance to me and the data in the remaining columns is irrelevant.

Starting on line 8 and then working in multiples of 7 (ie. lines 8, 15, 22, 29, 36 etc...), I am looking to create a dictionary with Python 2.7 with the information from these fields. The data in column A will be the key (a 6-digit integer) and the data in column C being the respective value for the key. I've tried to highlight this below but the formatting isn't the best:-

A B C D
1 CDCDCDCD
2 VDDBDDB
3
4
5
6
7 DDEFEEF FEFEFEFE
8 123456 JONES
9
10
11
12
13
14
15 293849 SMITH


As per the above, I am looking to extract the value from A7 (DDEFEEF) as a key in my dictionary and "FEFEFEFE" being the respective data and then add another entry to my dictionary, jumping to line 15 with "2938495" being my key and "Smith" being the respective value.

Any suggestions? The source file is a .txt file with entries being tab-delimited.
Thanks

Clarification:

Just to clarify, so far, I have tried the below:-

import csv

mydict = {:}
f = open("myfile", 'rt')
reader = csv.reader(f)
for row in reader:
print row


The above simply prints out all content though a row at a time. I did try "for row(7) in reader" but this returned an error. I then researched it and had a go at the below but it didn't work neither:

import csv
from itertools import islice

entries = csv.reader(open("myfile", 'rb'))
mydict = {'key' : 'value'}

for i in xrange(6):
mydict['i(0)] = 'I(2) # integers representing columns
range = islice(entries,6)
for entry in range:
mydict[entries(0) = entries(2)] # integers representing columns

Answer

Start by turning the text into a list of lists. That will take care of the parsing part:

lol = list(csv.reader(open('text.txt', 'rb'), delimiter='\t'))

The rest can be done with indexed lookups:

d = dict()
key = lol[6][0]      # cell A7
value = lol[6][3]    # cell D7
d[key] = value       # add the entry to the dictionary
 ...