triphook - 4 months ago 38

Python Question

I would like to multiply (in place) values in one column of a DataFrame by values in another column, based on a condition in a third column. For example:

`data = pd.DataFrame({'a': [1, 33, 56, 79, 2], 'b': [9, 12, 14, 5, 5], 'c': np.arange(5)})`

`data.loc[data.a > 10, ['a', 'b']] *= data.loc[data.a > 10, 'c']`

What I would like this to do is multiply the values of both 'a' and 'b' by the corresponding (same row) value in 'c' based on a condition. However, the above code just results in NaN values in the desired range.

The closest workaround I've found has been to do this:

`data.loc[data.a > 10, ['a', 'b']] = (data.loc[data.a > 10, ['a', 'b']].as_matrix().T * data.loc[data.a > 10, 'c']).T`

which works, but it seems like there is a better (more Pythonic) way that I'm missing.

Answer

You can use the `mul(..., axis=0)` method, which matches the Series against the DataFrame's row index instead of its column labels:

```
In [122]: mask = data.a > 10
In [125]: data.loc[mask, ['a','b']] = data.loc[mask, ['a','b']].mul(data.loc[mask, 'c'], axis=0)
In [126]: data
Out[126]:
     a   b  c
0    1   9  0
1   33  12  1
2  112  28  2
3  237  15  3
4    2   5  4
```
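For anyone wondering why the original `*=` attempt produces NaN, here is a minimal, self-contained sketch. It assumes a reasonably recent pandas, and the names `mask`, `cols`, `naive` and `scaled` are only illustrative. With the plain `*` operator, pandas matches the Series index (the row labels 1, 2, 3) against the DataFrame's column labels (`'a'`, `'b'`). Nothing lines up, so every cell comes back NaN. Passing `axis=0` to `mul` matches on the row index instead.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame({'a': [1, 33, 56, 79, 2],
                     'b': [9, 12, 14, 5, 5],
                     'c': np.arange(5)})

mask = data['a'] > 10   # rows to update
cols = ['a', 'b']       # columns to scale

# Plain `*` aligns the Series index with the DataFrame's COLUMN labels,
# so no labels match and the result is entirely NaN.
naive = data.loc[mask, cols] * data.loc[mask, 'c']
print(naive.isna().all().all())

# mul(..., axis=0) aligns on the ROW index, so each selected row is
# multiplied by its own value of 'c'.
scaled = data.loc[mask, cols].mul(data.loc[mask, 'c'], axis=0)
data.loc[mask, cols] = scaled
print(data)
```

Computing `mask` once and reusing it also keeps the selection identical on both sides of the assignment.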