Fang Fang - 3 years ago 172
Python Question

How to select all non-NaN columns and non-NaN last column using pandas?

Forgive me if the title a little bit confusing.

Assuming I have

test.h5
. Below is the result of reading this file using
df.read_hdf('test.h5', 'testdata')


0 1 2 3 4 5 6
0 123 444 111 321 NaN NaN NaN
1 12 234 113 67 21 32 900
3 212 112 543 321 45 NaN NaN


I want to select the last non-Nan column. My expected result is like this

0 321
1 900
2 45


Also I want to select all column except the last non-NaN column. My expected result perhaps is like this. It might can be in numpy array but I have not foud any solution yet.

0 1 2 3 4 5 6
0 123 444 111
1 12 234 113 67 21 32
3 212 112 543 321


I searched online and found
df.iloc[:, :-1]
for reading all column but the last one and
df.iloc[:, -1]
for reading the last column.

My current result using these 2 command is like this:
1. for reading all column except the last one

0 1 2 3 4 5
0 123 444 111 321 NaN NaN
1 12 234 113 67 21 32
3 212 112 543 321 45 NaN


2.for reading the last column

0 NaN
1 900
2 Nan


My question is, is there any command or query used in pandas to address these condition?

Thank you for any help and suggestion.

Answer Source

You can use sorted to satisfy your condition i.e

ndf = df.apply(lambda x : sorted(x,key=pd.notnull),1)

This will give

     0      1      2      3      4      5      6
0   NaN    NaN    NaN  123.0  444.0  111.0  321.0
1  12.0  234.0  113.0   67.0   21.0   32.0  900.0
3   NaN    NaN  212.0  112.0  543.0  321.0   45.0

Now you can select the last column i.e

ndf.iloc[:,-1]
0    321.0
1    900.0
3     45.0
Name: 6, dtype: float64
ndf.iloc[:,:-1].apply(lambda x : sorted(x,key=pd.isnull),1)
      0      1      2      3     4     5
0  123.0  444.0  111.0    NaN   NaN   NaN
1   12.0  234.0  113.0   67.0  21.0  32.0
3  212.0  112.0  543.0  321.0   NaN   NaN
Recommended from our users: Dynamic Network Monitoring from WhatsUp Gold from IPSwitch. Free Download