Cleb - 1 month ago 9

Python Question

I have two dataframes like this

`import pandas as pd`

import numpy as np

np.random.seed(0)

df1 = pd.DataFrame(np.random.randint(10, size=(5, 4)), index=list('ABCDE'), columns=list('abcd'))

df2 = pd.DataFrame(np.random.randint(10, size=(2, 4)), index=list('CE'), columns=list('abcd'))

a b c d

A 5 0 3 3

B 7 9 3 5

C 2 4 7 6

D 8 8 1 6

E 7 7 8 1

a b c d

C 5 9 8 9

E 4 3 0 3

The index of

`df2`

`df1`

I want to create a third dataframe

`df3 = df1 - df2`

`a b c d`

A NaN NaN NaN NaN

B NaN NaN NaN NaN

C -3.0 -5.0 -1.0 -3.0

D NaN NaN NaN NaN

E 3.0 4.0 8.0 -2.0

I don't want the

`NAs`

`df1`

`fillna`

`df1`

`df2`

A workaround would be to do the subtract only the required rows like:

`sub_ind = df2.index`

df3 = df1.copy()

df3.loc[sub_ind, :] = df1.loc[sub_ind, :] - df2.loc[sub_ind, :]

which gives me the desired output

`a b c d`

A 5 0 3 3

B 7 9 3 5

C -3 -5 -1 -3

D 8 8 1 6

E 3 4 8 -2

but maybe there is a more straightforward way of achieving this?

Answer Source

If you use the `sub`

method instead of `-`

, you can pass a fill value:

```
df1.sub(df2, fill_value=0)
Out:
a b c d
A 5.0 0.0 3.0 3.0
B 7.0 9.0 3.0 5.0
C -3.0 -5.0 -1.0 -3.0
D 8.0 8.0 1.0 6.0
E 3.0 4.0 8.0 -2.0
```