I'm using sklearn's pairwise distance function, which saved my life when computing a huge matrix, but the problem I'm having is that I lose my indices.
Specifically, I initially have a huge dataframe of 17000 x 300, which I break down into 4 different dataframes based on some class condition.
The 4 separate dataframes keep the original indices, but when I run the pairwise distance function on one of them, it returns a 2D array with the correct values whose rows and columns are simply numbered from 0 upward.
How do I keep or recover the original indices?
from sklearn.metrics import pairwise as pair  # assumed import behind the `pair` alias
distance1 = pair.pairwise_distances(df1, metric='euclidean')
You can create a DataFrame with matching indices by passing the original index to the DataFrame constructor, for example pd.DataFrame(distance1, index=df1.index).
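As a minimal sketch of that idea: the small df1 below is a made-up stand-in for one of the four class-specific dataframes, and its labels (12, 57, 903) and column names are purely illustrative. Passing columns=df1.index as well is my own suggestion rather than something the question requires; since the distance matrix is square, labelling both axes lets you look up a distance by the original row labels.

```python
import numpy as np
import pandas as pd
from sklearn.metrics import pairwise_distances

# Hypothetical stand-in for df1: a subset that kept its original row labels.
df1 = pd.DataFrame(
    [[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]],
    index=[12, 57, 903],
    columns=["x", "y"],
)

# pairwise_distances returns a plain NumPy array, so the row labels are lost here.
distance1 = pairwise_distances(df1, metric="euclidean")

# Re-attach the labels on both axes: entry (i, j) is the distance
# between the rows of df1 labelled i and j.
dist_df = pd.DataFrame(distance1, index=df1.index, columns=df1.index)

# Look up a distance by original labels instead of by position.
d = dist_df.loc[12, 903]
```

Because the labels come straight from df1.index, the same pattern should work unchanged for each of the other dataframes.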
Furthermore, if you would like to concatenate it horizontally to your existing DataFrame, you can use:
pd.concat((df1, pd.DataFrame(distance1, index=df1.index)), axis=1)