ale19 - 3 months ago
Python Question

pandas: all NaNs when subtracting two dataframes

I have two dataframes. I want to subtract one from the other, even though they have a different number of columns.

>df1

index 0 1 2 3 4 5
TOTAL 5 46 56 110 185 629

>df2
index 1 2 3 4 5
Use 25 37 86 151 512


I would assume that subtracting two dataframes with different dimensions would only result in NaNs in the mismatched columns (in this case, Column 0). The remaining columns would be the result of df1[1]-df2[1], df1[2]-df2[2], etc.

>df1 - df2
index 0 1 2 3 4 5
TOTAL NaN 21 19 24 34 117


But this is not the case. This is what actually happens when I subtract the dataframes:

>df1 - df2
index 0 1 2 3 4 5
Use NaN NaN NaN NaN NaN NaN
TOTAL NaN NaN NaN NaN NaN NaN


I also tried just subtracting the values:

>df1.values - df2.values
Traceback (most recent call last):

  File "<ipython-input-376-1dc5b3b4ad3e>", line 1, in <module>
    total_drugs.values-(restraints_drugs.values+norestraints_drugs.values)

ValueError: operands could not be broadcast together with shapes (1,6) (1,5)


What am I doing wrong? I'm using pandas 0.18.

Answer

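You are subtracting two dataframes, so pandas aligns them on both row and column labels before subtracting. Any cell whose row and column labels are not present in both dataframes becomes NaN. In your case, the row indices TOTAL and Use do not match, so every cell in the result is NaN.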
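To see the alignment in action, here is a minimal sketch that rebuilds the two dataframes from the question. The integer column labels and the constructor calls are assumptions on my part, since the question only shows the printed output.

```python
import pandas as pd

# Assumed reconstruction of the frames in the question (integer column labels).
df1 = pd.DataFrame([[5, 46, 56, 110, 185, 629]],
                   index=["TOTAL"], columns=[0, 1, 2, 3, 4, 5])
df2 = pd.DataFrame([[25, 37, 86, 151, 512]],
                   index=["Use"], columns=[1, 2, 3, 4, 5])

# DataFrame - DataFrame aligns on row labels AND column labels.
result = df1 - df2

# The result covers the union of both label sets: 2 rows x 6 columns.
print(result.shape)

# No (row, column) pair exists in both frames, so every cell is NaN.
print(result.isnull().values.all())
```

The row labels are the deciding factor here: even the columns that both frames share (1 to 5) come out as NaN, because no row label appears in both.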

To get what you're looking for, subtract the series df2.ix['Use'] from df1 instead. A series is aligned against df1's column labels only, so the mismatched row labels no longer matter:

df1.sub(df2.squeeze())


Or:

df1.sub(df2.ix['Use'])

Or:

df1.sub(df2.loc['Use'])

Or:

df1 - df2.ix['Use']

Or:

df1 - df2.loc['Use']
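As a runnable check of the variants above, the sketch below again assumes the frames look like the printed output in the question, with integer column labels. It uses `.loc` rather than `.ix`, because `.ix` was deprecated in pandas 0.20 and removed in 1.0.

```python
import pandas as pd

# Assumed reconstruction of the frames in the question (integer column labels).
df1 = pd.DataFrame([[5, 46, 56, 110, 185, 629]],
                   index=["TOTAL"], columns=[0, 1, 2, 3, 4, 5])
df2 = pd.DataFrame([[25, 37, 86, 151, 512]],
                   index=["Use"], columns=[1, 2, 3, 4, 5])

# Pull the single row out as a Series indexed by the column labels 1..5.
row = df2.loc["Use"]

# DataFrame.sub with a Series matches the Series index against df1's columns.
# Column 0 has no counterpart in the Series, so only that cell becomes NaN.
diff = df1.sub(row)
print(diff)

# For a one-row frame, squeeze() yields the same Series, and the operator
# form is equivalent to the method form.
same_via_squeeze = df1 - df2.squeeze()
print(diff.equals(same_via_squeeze))
```

The result has the single row TOTAL, with NaN in column 0 and the differences 21, 19, 24, 34, 117 in columns 1 to 5, which is the output the question expected. Note that `squeeze()` only works as a shortcut because df2 has exactly one row.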