Philipp Schwarz - 1 year ago
Python Question

Indexing and selecting data from read_csv pandas df

I have constructed a matrix with integer values for the columns and the index. The matrix is actually hierarchical, with one block per month. My problem is that indexing and selecting data no longer works as it did before once I write the data to csv and load it back as a pandas dataframe.

Selecting data before writing and reading data to file:

would for example give

In words: select the month January and get the (travel) flow from origin 4 to destination 3.

After writing the data to csv and reading it back into pandas, the original referencing fails, but if I convert the column label to a string it works:


... the column names have automatically been transformed from integers into strings. But I would prefer the original indexing.
Any suggestions?
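
A minimal sketch of the behaviour described above. The small matrix, its values, the integer month encoding (1 for January) and the in-memory buffer used in place of a file are all assumptions for illustration, not the original data:

```python
import io
import pandas as pd

# Hypothetical stand-in for the monthly origin/destination matrix:
# index levels are (month, origin), columns are destinations.
index = pd.MultiIndex.from_product([[1, 2], [3, 4]], names=["month", "origin"])
df = pd.DataFrame([[0, 10], [20, 0], [0, 30], [40, 0]], index=index, columns=[3, 4])

# Before the round trip the integer column label works.
before = df.loc[(1, 4), 3]

# Round trip through CSV (an in-memory buffer instead of a file on disk).
buffer = io.StringIO()
df.to_csv(buffer)
buffer.seek(0)
loaded = pd.read_csv(buffer, index_col=[0, 1])

# The header row is parsed as text, so the column labels come back as strings.
column_labels = loaded.columns.tolist()
after = loaded.loc[(1, 4), "3"]
```

The index levels are parsed back as integers, but the column labels are not, because read_csv treats the header row as strings.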

My current quick fix for handling the data after loading from csv is:

#Writing df to file

#Loading df from csv (filepath_inputdata is defined elsewhere)
import pandas as pd

test_matrix = pd.read_csv(filepath_inputdata + '/Final_monthly_FlightData_countrylevel_v4.csv',
                          index_col=[0, 1])

# Replace each string column label with its integer equivalent
df = test_matrix.copy()
for column in test_matrix.columns.tolist():
    df[int(column)] = df[column]
    df.drop(column, axis=1, inplace=True)


example df

Answer Source

You can change the type of column names to integer with:

test_matrix.rename(columns=int, inplace=True)
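
As a rough, self-contained illustration of that call: the CSV text below is made up, and the column names `month` and `origin` are assumptions. `rename` applies `int` to every column label; casting the column index with `astype(int)` is shown as an alternative that should give the same labels:

```python
import io
import pandas as pd

# Made-up CSV content standing in for the real file.
csv_text = "month,origin,3,4\n1,3,0,10\n1,4,20,0\n"
test_matrix = pd.read_csv(io.StringIO(csv_text), index_col=[0, 1])

# rename calls int() on each column label, turning '3' into 3.
renamed = test_matrix.rename(columns=int)
flow = renamed.loc[(1, 4), 3]

# Alternative: cast the whole column index in one step.
recast = test_matrix.copy()
recast.columns = recast.columns.astype(int)
```

Both variants will raise a ValueError if any column label cannot be parsed as an integer.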

To remove the index names (one per index level):

test_matrix.index.names = [None, None]
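
A small sketch of what that assignment does, again on made-up CSV text with assumed level names `month` and `origin`:

```python
import io
import pandas as pd

# Made-up CSV content; the first two columns become the index levels.
csv_text = "month,origin,3,4\n1,3,0,10\n1,4,20,0\n"
test_matrix = pd.read_csv(io.StringIO(csv_text), index_col=[0, 1])

# read_csv takes the level names from the header row.
names_before = list(test_matrix.index.names)

# One None per index level clears both names.
test_matrix.index.names = [None, None]
names_after = list(test_matrix.index.names)
```

Clearing the names only changes how the frame is labelled; selection with .loc works the same either way.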