Vyraj - 5 months ago
Python Question

pandas column name assignment read_csv

I have a csv file as follows:

0 5
1 10
2 15
3 20
4 25


I want to save it as a dataframe with x,y axes as names, then plot it. However, when I assign x, y as the column names I get a messed up DataFrame. What is happening?

column_names = ['x','y']
x = pd.read_csv('csv-file.csv', header = None, names = column_names)
print(x)

      x   y
0   0 5 NaN
1  1 10 NaN
2  2 15 NaN
3  3 20 NaN
4  4 25 NaN


I've tried without specifying None for header, to no avail.

Answer

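read_csv splits on commas by default, so each whitespace-separated line is read as a single field: the whole line lands in x and y is filled with NaN. Add the parameter sep="\s+" or delim_whitespace=True to read_csv: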

import pandas as pd
import io

temp=u"""0 5
1 10
2 15
3 20
4 25"""
# after testing, replace io.StringIO(temp) with the filename
column_names = ['x','y']
# raw string so that \s is not treated as an invalid string escape
df = pd.read_csv(io.StringIO(temp), sep=r"\s+", header=None, names=column_names)

print(df)
   x   y
0  0   5
1  1  10
2  2  15
3  3  20
4  4  25

Or, on older pandas versions (delim_whitespace is deprecated since pandas 2.2 in favor of sep="\s+"):

column_names = ['x','y']
df = pd.read_csv(io.StringIO(temp),
                 delim_whitespace=True,
                 header=None,
                 names=column_names)

print(df)
   x   y
0  0   5
1  1  10
2  2  15
3  3  20
4  4  25
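
To see the cause directly, here is a minimal sketch that parses the same sample data both ways and compares the results. The variable names bad and good are only illustrative, and the data is held in memory rather than read from csv-file.csv.

```python
import io

import pandas as pd

# Same sample data as in the question, held in memory instead of a file.
temp = """0 5
1 10
2 15
3 20
4 25"""
column_names = ['x', 'y']

# Default separator is a comma: each line is one field, so 'y' gets no data.
bad = pd.read_csv(io.StringIO(temp), header=None, names=column_names)
print(bad['x'].tolist())      # whole lines stored as strings in 'x'
print(bad['y'].isna().all())  # every value in 'y' is missing

# Whitespace separator: the two numbers land in separate columns.
good = pd.read_csv(io.StringIO(temp), sep=r"\s+", header=None, names=column_names)
print(good.dtypes)
print(good['y'].tolist())
```

Once the columns parse as numbers, df.plot(x='x', y='y') should draw the line the question asks for, assuming matplotlib is installed.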