user1911092 user1911092 - 1 month ago 9
Python Question

Comparing columns of 2 dataframes

I am trying to get the columns that exist in one data frame but not in another.

DF_A has 10 columns
DF_B has 3 columns (all three match column names in DF_A).

Previously I was using:

cols_to_use = DF_A.columns - DF_B.columns

Since updating pandas, I get this error:
TypeError: cannot perform sub with this index type:

What should I be doing now instead?

Thank you!

Answer

You can use the Index.difference() method, which returns the labels that are in one index but not in the other:

Demo:

In [12]: df
Out[12]:
   a  b  c  d
0  0  8  0  3
1  3  4  1  7
2  0  5  4  0
3  0  9  7  0
4  5  8  5  4

In [13]: df2
Out[13]:
   a  d
0  4  3
1  3  1
2  1  2
3  3  4
4  0  3

In [14]: df.columns.difference(df2.columns)
Out[14]: Index(['b', 'c'], dtype='object')

In [15]: cols = df.columns.difference(df2.columns)

In [16]: df[cols]
Out[16]:
   b  c
0  8  0
1  4  1
2  5  4
3  9  7
4  8  5
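
One thing to watch for: difference() returns the labels sorted, which may not match the column order of the original frame. If the order matters, a boolean mask built with Index.isin() is one alternative. The sketch below is only an illustration; df_a and df_b are made-up stand-ins for DF_A and DF_B from the question, not the asker's actual data.

```python
import pandas as pd

# Hypothetical frames standing in for DF_A and DF_B from the question.
df_a = pd.DataFrame({"d": [3, 7], "a": [0, 3], "c": [0, 1], "b": [8, 4]})
df_b = pd.DataFrame({"a": [4, 3], "d": [3, 1]})

# Set difference of the column labels; the result comes back sorted.
cols_sorted = df_a.columns.difference(df_b.columns)

# A boolean mask via isin() keeps the columns in df_a's original order.
cols_in_order = df_a.columns[~df_a.columns.isin(df_b.columns)]

print(list(cols_sorted))    # ['b', 'c']
print(list(cols_in_order))  # ['c', 'b']

# Either result can be used to select the remaining columns.
print(df_a[cols_in_order])
```

Both approaches select the same columns; they differ only in the order of the result.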