piRSquared - 9 months ago 45

You could use `numpy.random.choice` to generate a boolean mask and pass it to `df.mask`, which replaces the cells where the mask is `True` with `NaN`:

```
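import numpy as np
# Each cell is True (to be blanked) with probability 0.2.
# size must match the shape of df; (10,10) assumes a 10x10 frame.
mask = np.random.choice([True, False], size=(10,10), p=[.2,.8])
df.mask(mask)  # returns a copy with NaN wherever mask is True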
```

In one line (with the size taken from `df.shape`, as @root suggests):

```
df.mask(np.random.choice([True, False], size=df.shape, p=[.2,.8]))
```
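For a reproducible, self-contained version, here is a minimal sketch. The helper name `mask_random_cells` and its `frac` and `seed` parameters are hypothetical, introduced only for illustration, and it uses NumPy's `default_rng` generator in place of the global `np.random.choice` so the result can be seeded:

```python
import numpy as np
import pandas as pd


def mask_random_cells(df, frac=0.2, seed=None):
    """Return a copy of df with roughly `frac` of its cells set to NaN.

    Hypothetical helper for illustration; not part of pandas or NumPy.
    """
    rng = np.random.default_rng(seed)
    # Boolean array shaped like df; each cell is True with probability frac.
    mask = rng.choice([True, False], size=df.shape, p=[frac, 1 - frac])
    # DataFrame.mask replaces cells where the condition is True with NaN.
    return df.mask(mask)


df = pd.DataFrame(np.ones((10, 10)) * 2)
masked = mask_random_cells(df, frac=0.2, seed=0)

# Share of cells that were blanked; close to frac on average, not exactly frac.
nan_fraction = masked.isna().to_numpy().mean()
print(nan_fraction)
```

Note that each cell is masked independently, so the number of `NaN` cells varies from run to run unless a seed is fixed.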

Speed tested using `timeit` at ~770 μs per loop:

```
$ python -m timeit -n 10000 \
    -s "import pandas as pd;import numpy as np;df=pd.DataFrame(np.ones((10,10))*2)" \
    "df.mask(np.random.choice([True,False], size=df.shape, p=[.2,.8]))"
10000 loops, best of 3: 770 usec per loop
```