zsljulius - 5 months ago
Python Question

Difficulty importing .dat file

I am having difficulty reading this file into Python with the pandas read_table function:
http://www.ssc.wisc.edu/~bhansen/econometrics/invest.dat

This is my code:

pd.read_table(f,skiprows=[0], sep="")


This raises the following error:

TypeError: ord() expected a character, but string of length 0 found

Answer

I don't know about read_table, but you can read this file directly as follows:

import pandas as pd

with open('/tmp/invest.dat', 'r') as f:
    next(f)  # skip the first row
    # split() with no arguments splits on any run of whitespace
    df = pd.DataFrame(line.split() for line in f)

print(df)

Prints:

              0            1             2            3
0     17.749000   0.66007000    0.15122000   0.33150000
1     3.9480000   0.52889000    0.11523000   0.56233000
2     14.810000    3.7480300    0.57099000   0.12111000
...
...

The same table can be obtained in one call with read_csv, which also parses the columns as numbers instead of leaving them as strings:

df = pd.read_csv('/tmp/invest.dat', sep=r'\s+', header=None, skiprows=1)
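To try this without downloading the file, here is a minimal sketch that reads the same layout from an in-memory string. The first line of `sample` is a made-up placeholder for the row being skipped (the real first line of invest.dat is not shown here), and the three data rows are copied from the output printed above. The variable names `sample`, `df_manual`, and `df_csv` are only for illustration.

```python
import io

import pandas as pd

# Stand-in for invest.dat. The first line is a placeholder for the skipped row;
# the data rows are the three rows shown in the printed output above.
sample = (
    "placeholder first line\n"
    "17.749000   0.66007000   0.15122000   0.33150000\n"
    "3.9480000   0.52889000   0.11523000   0.56233000\n"
    "14.810000    3.7480300   0.57099000   0.12111000\n"
)

# Manual approach: skip the first line, split each remaining line on whitespace.
lines = io.StringIO(sample)
next(lines)
df_manual = pd.DataFrame(line.split() for line in lines)

# The manual approach leaves every value as a string; convert explicitly.
df_manual = df_manual.astype(float)

# read_csv approach: a regex separator matching any run of whitespace.
df_csv = pd.read_csv(io.StringIO(sample), sep=r"\s+", header=None, skiprows=1)

print(df_csv.dtypes)
print(df_manual.equals(df_csv))
```

The raw string `r"\s+"` avoids Python treating `\s` as a string escape. Note that the manual approach needs the `astype(float)` step before the two frames hold comparable numeric data.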