amc - 3 months ago
Python Question

Make a list of every column in a file in Python

I would like to create a list for every column in a txt file.
The file looks like this:

NAME S1 S2 S3 S4
A 1 4 3 1
B 2 1 2 6
C 2 1 3 5


PROBLEM 1. How do I dynamically create as many lists as there are columns, so that I can fill them? Some files will have 4 columns, others 6 or 8...

PROBLEM 2. What is a pythonic way to iterate through each column and make a list of the values like this:

list_s1 = [1,2,2]

list_s2 = [4,1,1]


etc.

Right now I have read in the txt file and I have each individual line. As input I give the number of NAMES in a file (here HOW_MANY_SAMPLES = 4)

def parse_textFile(file):

    list_names = []
    with open(file) as f:
        header = f.next()
        head_list = header.rstrip("\r\n").split("\t")
        for i in f:
            e = i.rstrip("\r\n").split("\t")
            list_names.append(e)

    for i in range(1, HOW_MANY_SAMPLES):
        l+i = []
        l+i.append([a[i] for a in list_names])


I need a dynamic way of creating and filling the number of lists that correspond to the amount of columns in my table.

Answer

With pandas you can build either a list of lists or a dict to get what you are looking for.

Read your file into a DataFrame, then iterate over its columns and add each one to a list or a dict.

from StringIO import StringIO
import pandas as pd

TESTDATA = StringIO("""NAME   S1   S2   S3   S4
                        A   1    4   3   1 
                        B   2    1   2   6
                        C   2    1   3   5""")

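columns = []   # one list per column, in file order
c_dic = {}     # column name -> list of values
# A separator longer than one character is treated as a regular expression,
# which requires the Python parsing engine.
df = pd.read_csv(TESTDATA, sep="   ", engine='python')
for column in df:  # iterating over a DataFrame yields its column labels
    columns.append(df[column].tolist())
    c_dic[column] = df[column].tolist()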

Then you will have a list of lists, one for each column:

for x in columns:
    print x

prints

['A', 'B', 'C']
[1, 2, 2]
[4, 1, 1]
[3, 2, 3]
[1, 6, 5]

and

for k,v in c_dic.iteritems():
    print k,v

prints

S3 [3, 2, 3]
S2 [4, 1, 1]
NAME ['A', 'B', 'C']
S1 [1, 2, 2]
S4 [1, 6, 5]

Use the dict if you need to keep track of the column names along with the data.
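
If pandas is not an option, roughly the same result can be had with the standard library alone. The following is only a minimal sketch, assuming Python 3 (the snippets above use Python 2 idioms such as `print x` and `iteritems()`) and a tab-separated file, as suggested by the `split("\t")` in the question. The function name `columns_from_table` and the in-memory sample are made up for illustration. Instead of creating a variable per column, it keys the lists by column name, so the number of columns does not need to be known in advance.

```python
import csv
import io


def columns_from_table(lines, delimiter="\t"):
    """Return {column_name: [values, ...]} for a delimited table with a header row.

    Hypothetical helper for illustration; values are left as strings.
    """
    reader = csv.reader(lines, delimiter=delimiter)
    header = next(reader)                    # first row holds the column names
    rows = [row for row in reader if row]    # skip blank lines
    # zip(*rows) transposes rows into columns, however many there are
    return {name: list(col) for name, col in zip(header, zip(*rows))}


# In-memory stand-in for the file from the question (assumed tab-separated)
sample = io.StringIO(
    "NAME\tS1\tS2\tS3\tS4\n"
    "A\t1\t4\t3\t1\n"
    "B\t2\t1\t2\t6\n"
    "C\t2\t1\t3\t5\n"
)

columns = columns_from_table(sample)
list_s1 = [int(v) for v in columns["S1"]]    # convert to int where needed
print(list_s1)
```

With a real file, something like `with open(path, newline="") as f: columns = columns_from_table(f)` should work the same way, where `path` stands for your own file name.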