Fringant Fringant - 2 months ago 5
Python Question

Python pandas: Select columns where a specific row satisfies a condition

I have a dataframe dfall where there is a row labeled 'row1' with values 'foo' and 'bar'. I want to select only columns of dfall where 'row1' has the value 'foo'.

In other words:

dfall= pd.DataFrame([['bar','foo'],['bla','bli']], columns=['col1','col2'], index=['row1','row2'])


I want as result the column 'col2'containing:
['foo','bli']


I tried:

dfall[dfall.loc['row1'].isin(['foo'])]


I get the error

IndexingError: Unalignable boolean Series key provided


Can anybody help me with the command? Thanks in advance!

Answer

You can compare your df against the scalar value, and then use any with axis=0 and pass this boolean mask to ix:

In [324]:
df.ix[:,(df == 'foo').any(axis=0)]

Out[324]:
     col2
row1  foo
row2  bli

breaking the above down:

In [325]:
df == 'foo'

Out[325]:
       col1   col2
row1  False   True
row2  False  False

In [326]:
(df == 'foo').any(axis=0)

Out[326]:
col1    False
col2     True
dtype: bool
Comments