Philipp Schwarz - 6 months ago
Python Question

Indexing and selecting data from read_csv pandas df

I have constructed a matrix with integer values for both the column labels and the index. The matrix is actually hierarchical, with one block per month. My problem is that indexing and selecting data no longer works as before once I write the data to CSV and load it back as a pandas DataFrame.

Selecting data before writing and reading data to file:

matrix.ix[1][4][3]
would for example give
123


In words: select the month January and get the (travel) flow from origin 4 to destination 3.

After writing the data to CSV and reading it back into pandas, the original referencing fails, but it works if I use a string for the column label:

matrix.ix[1]['4'][3]


... the column names have automatically been transformed from integers into strings. But I would prefer the original indexing.
Any suggestions?

My current quick fix for handling the data after loading from csv is:

#Writing df to file
mulitindex_df_Travel_monthly.to_csv(r'result/Final_monthly_FlightData_countrylevel_v4.csv')


#Loading df from csv
test_matrix = pd.read_csv(filepath_inputdata+'/Final_monthly_FlightData_countrylevel_v4.csv',
                          index_col=[0, 1])


df = test_matrix.copy()
for column in test_matrix.columns.tolist():
    df[int(column)] = df[column]
    df.drop(column, axis=1, inplace=True)


CSV FILE:
https://www.dropbox.com/s/4u2opzh65zwcn81/travel_matrix_SO.csv?dl=0

example df

Answer

You can change the type of column names to integer with:

test_matrix.rename(columns=int, inplace=True)
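
A minimal sketch of the full round trip, in case it helps to see the effect in isolation. The small frame, the level names `month` and `destination`, and the in-memory buffer are made up for illustration and are not the asker's data. It uses `.loc` rather than `.ix`, since `.ix` was removed in pandas 1.0.

```python
import io

import pandas as pd

# Hypothetical stand-in for the travel matrix:
# rows are (month, destination), columns are origins.
index = pd.MultiIndex.from_product([[1, 2], [3, 4]], names=["month", "destination"])
matrix = pd.DataFrame(
    [[0, 123], [10, 0], [0, 20], [30, 0]],
    index=index,
    columns=[3, 4],
)

# Round trip through CSV (a buffer stands in for the file on disk).
buffer = io.StringIO()
matrix.to_csv(buffer)
buffer.seek(0)
test_matrix = pd.read_csv(buffer, index_col=[0, 1])

# Header labels are parsed as strings, so integer lookups on columns fail here.
labels_after_read = list(test_matrix.columns)

# rename() applies int() to every column label and returns a new frame.
restored = test_matrix.rename(columns=int)

# Month 1, destination 3, origin 4.
flow = restored.loc[(1, 3), 4]
```

Assigning `test_matrix.columns = test_matrix.columns.astype(int)` should have the same effect, provided every column label is a valid integer string.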

To remove the index name:

test_matrix.index.names = [None, None]
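
As a quick check of what that assignment does, here is a small sketch on a made-up two-level index. The level names `month` and `destination` are assumptions for illustration; the real names depend on the CSV header.

```python
import pandas as pd

# Hypothetical frame with named index levels, as read_csv would produce
# when the first two CSV columns have header names.
index = pd.MultiIndex.from_product([[1, 2], [3, 4]], names=["month", "destination"])
test_matrix = pd.DataFrame({3: [0, 10, 0, 30], 4: [123, 0, 20, 0]}, index=index)

names_before = list(test_matrix.index.names)

# One entry per index level; None clears that level's name.
test_matrix.index.names = [None, None]

names_after = list(test_matrix.index.names)
```

Only the labels shown above the index change; the index values and the data are untouched.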