gvoysey - 4 months ago
Python Question

pandas DataFrame sort on column raises KeyError on index

I have the following DataFrame, df:

   peaklatency        snr
0        52.99        0.0
1        54.15  62.000000
2        54.12  82.000000
3        54.64  52.000000
4        54.57  42.000000
5        54.13  72.000000


I'm attempting to sort this by snr:

df.sort_values(df.snr)


but this raises

_convert_to_indexer(self, obj, axis, is_setter)
1208 mask = check == -1
1209 if mask.any():
-> 1210 raise KeyError('%s not in index' % objarr[mask])
1211
1212 return _values_from_object(indexer)

KeyError: '[ inf 62. 82. 52. 42. 72.] not in index'


I am not explicitly setting an index on this DataFrame; it is built from a list of dicts collected in a loop:

import pandas as pd

# Collect one record (dict) per run, then build the DataFrame from the list.
d = []
for run in runs:
    d.append({
        'snr': run.periphery.snr.snr,
        'peaklatency': (run.brainstem.wave5.wave5.argmax() / 100e3) * 1e3
    })
df = pd.DataFrame(d)

Answer

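The by keyword to sort_values expects column names, not the actual Series itself. When given df.snr, pandas treats the values in that Series as column labels to look up, which is why the KeyError lists the snr values as "not in index". So, you'd want: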

In [23]: df.sort_values('snr')
Out[23]: 
   peaklatency   snr
0        52.99   0.0
4        54.57  42.0
3        54.64  52.0
1        54.15  62.0
5        54.13  72.0
2        54.12  82.0
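
For completeness, here is a minimal self-contained sketch of the same fix. It assumes the values from the table in the question (the original runs objects are not available, so the records list below is a stand-in), and the variable names by_snr, by_snr_desc and by_position are only illustrative.

```python
import pandas as pd

# Stand-in for the question's loop: one dict per run, values copied from the table.
records = [
    {"peaklatency": 52.99, "snr": 0.0},
    {"peaklatency": 54.15, "snr": 62.0},
    {"peaklatency": 54.12, "snr": 82.0},
    {"peaklatency": 54.64, "snr": 52.0},
    {"peaklatency": 54.57, "snr": 42.0},
    {"peaklatency": 54.13, "snr": 72.0},
]
df = pd.DataFrame(records)  # no index given, so pandas assigns 0..n-1

# Pass the column label, not the Series itself.
by_snr = df.sort_values("snr")

# Descending order, discarding the old row labels.
by_snr_desc = df.sort_values("snr", ascending=False).reset_index(drop=True)

# If you only have an array of values (not a column label), sort by position instead.
order = df["snr"].to_numpy().argsort()
by_position = df.iloc[order]

print(by_snr)
```

sort_values returns a new DataFrame and leaves df unchanged, so assign the result if you want to keep the sorted order.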