user1355179 - 10 months ago
Python Question

How to retrieve row and column names from data frame?

I have generated a correlation matrix from a dataframe using pandas' DataFrame.corr:

cmat = sub1.corr()

          CESI001   CESI002   CESI003   CESI004
CESI001  1.000000  0.829723  0.046925  0.074475
CESI002  0.829723  1.000000  0.066766  0.073181
CESI003  0.046925  0.066766  1.000000 -0.098427
CESI004  0.074475  0.073181 -0.098427  1.000000

What I'm trying to do is generate a new dataframe consisting of [row, column, value] where the cell value meets some criteria. I have succeeded in retrieving the matching cell values:

for i2, r2 in cmat.iterrows():
    for item in cmat[i2]:
        if item > 0.3 and item < 0.9:
            print(item)

This correctly produces:


However, I can't work backwards from there to retrieve the row and column names. I've tried .loc, .columname and several other approaches I read about here. I gather that pandas is more about operating on the whole data frame at once. Any guidance is appreciated.

Answer
  • use stack to reshape the matrix so that each (row, column) pair gets its own row
  • use query to filter for the values you need

cmat.stack().to_frame('item').query('.3 < item < .9')
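To make the one-liner above easier to try, here is a minimal self-contained sketch. It rebuilds cmat by hand from the rounded values printed in the question (the real sub1 data is not available), and the names labels, matches, result, "row" and "column" are my own choices for illustration, not part of the original answer.

```python
import pandas as pd

# Assumption: the matrix is re-entered from the rounded values in the question.
labels = ["CESI001", "CESI002", "CESI003", "CESI004"]
cmat = pd.DataFrame(
    [
        [1.000000, 0.829723, 0.046925, 0.074475],
        [0.829723, 1.000000, 0.066766, 0.073181],
        [0.046925, 0.066766, 1.000000, -0.098427],
        [0.074475, 0.073181, -0.098427, 1.000000],
    ],
    index=labels,
    columns=labels,
)

# stack() moves the column labels into a second index level, giving one
# value per (row, column) pair; query() then filters on that value.
matches = cmat.stack().to_frame("item").query("0.3 < item < 0.9")

# Name the two index levels and turn them into ordinary columns,
# which yields the [row, column, value] layout asked for.
result = matches.rename_axis(["row", "column"]).reset_index()
print(result)
```

Because a correlation matrix is symmetric, each matching pair shows up twice, once as (CESI001, CESI002) and once as (CESI002, CESI001). If only one of each is wanted, the result can be filtered further, for example by keeping rows where the row label sorts before the column label.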
