triphook - 1 month ago
Python Question

Conditional multiplication of multiple series with another series

I would like to multiply (in place) values in one column of a DataFrame by values in another column, based on a condition in a third column. For example:

import numpy as np
import pandas as pd
data = pd.DataFrame({'a': [1, 33, 56, 79, 2], 'b': [9, 12, 14, 5, 5], 'c': np.arange(5)})
data.loc[data.a > 10, ['a', 'b']] *= data.loc[data.a > 10, 'c']


What I would like this to do is multiply the values of both 'a' and 'b' by the corresponding (same row) value in 'c' based on a condition. However, the above code just results in NaN values in the desired range.

The closest workaround I've found has been to do this:

data.loc[data.a > 10, ['a', 'b']] = (data.loc[data.a > 10, ['a', 'b']].as_matrix().T * data.loc[data.a > 10, 'c']).T


which works, but it seems like there is a better (more Pythonic) way that I'm missing.

Answer

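You can use the DataFrame.mul(..., axis=0) method. With axis=0 the Series is aligned on the row index; the plain * operator aligns it on the column labels instead, which is why the original attempt produced NaN: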

In [122]: mask = data.a > 10

In [125]: data.loc[mask, ['a','b']] = data.loc[mask, ['a','b']].mul(data.loc[mask, 'c'], axis=0)


In [126]: data
Out[126]:
     a   b  c
0    1   9  0
1   33  12  1
2  112  28  2
3  237  15  3
4    2   5  4
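
For anyone who wants to run this outside an IPython session, here is a minimal self-contained sketch of the same idea. It also reproduces the misaligned product from the question, so the source of the NaN values is visible. The names cols, misaligned, scaled and result are mine, chosen for illustration, and the update is applied to a copy rather than in place so the original frame can be compared afterwards.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame({'a': [1, 33, 56, 79, 2],
                     'b': [9, 12, 14, 5, 5],
                     'c': np.arange(5)})
mask = data['a'] > 10
cols = ['a', 'b']  # illustrative name for the columns to scale

# Plain `*` matches the Series index (row labels 1, 2, 3) against the
# DataFrame columns ('a', 'b'); nothing lines up, so every cell is NaN.
misaligned = data.loc[mask, cols] * data.loc[mask, 'c']

# axis=0 matches the Series index against the row index instead,
# so each selected row is multiplied by its own value of 'c'.
scaled = data.loc[mask, cols].mul(data.loc[mask, 'c'], axis=0)

# Work on a copy here; assign to data.loc[...] directly for an in-place update.
result = data.copy()
result.loc[mask, cols] = scaled
print(result)
```

Printing misaligned should show a frame that contains only NaN, while result should match Out[126] above.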