Richard Hall - 9 months ago

Python Question

I have two lists of equal length: one is a data series, the other is the corresponding time series. Together they represent simulated values measured over time.

I want to write a function that randomly removes a given fraction of the items from both lists. For example, if the fraction is 0.2, I want to remove a random 20% of the items, and the same indices must be removed from both lists.

For example, let n = 0.2 (20% to be deleted)

`a = [0,1,2,3,4,5,6,7,8,9]`

`b = [0,1,4,9,16,25,36,49,64,81]`

After randomly removing 20% of the items, they become

`a_new = [0,1,3,4,5,6,8,9]`

`b_new = [0,1,9,16,25,36,64,81]`

In my real data the relationship between the two lists isn't as simple as in this example, so I can't just thin out one list and then recompute the second; both already exist as separate lists. The remaining items also have to stay in their original order.

Thanks!

Answer

```
import random
a = [0,1,2,3,4,5,6,7,8,9]
b = [0,1,4,9,16,25,36,49,64,81]
frac = 0.2 # how much of a/b do you want to exclude
# generate the indices to exclude. Turn them into a set for O(1) lookup time
inds = set(random.sample(range(len(a)), int(frac*len(a))))
# use `enumerate` to get list indices as well as elements.
# Filter by index, but take only the elements
new_a = [n for i,n in enumerate(a) if i not in inds]
new_b = [n for i,n in enumerate(b) if i not in inds]
```
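
Since the question asks for a function, the snippet above could be wrapped up roughly as follows. This is only a sketch: the name `drop_fraction` and the optional `seed` parameter are my own additions for illustration, not something from the original answer. Note that `int(frac * len(a))` truncates, so the number of removed items is rounded down.

```python
import random


def drop_fraction(a, b, frac, seed=None):
    """Remove the same randomly chosen indices from both lists.

    `drop_fraction` and `seed` are hypothetical names used for this sketch.
    """
    if len(a) != len(b):
        raise ValueError("both lists must have the same length")
    if not 0 <= frac <= 1:
        raise ValueError("frac must be between 0 and 1")

    # A private Random instance makes the result reproducible when seed is given.
    rng = random.Random(seed)

    # Number of items to drop, rounded down.
    n_drop = int(frac * len(a))

    # Sample indices without replacement; a set gives O(1) membership tests.
    drop = set(rng.sample(range(len(a)), n_drop))

    # Filtering by index keeps the surviving items in their original order.
    new_a = [x for i, x in enumerate(a) if i not in drop]
    new_b = [y for i, y in enumerate(b) if i not in drop]
    return new_a, new_b


a = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
b = [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
a_new, b_new = drop_fraction(a, b, 0.2)
```

Here `a_new` and `b_new` each have 8 items, and the item at any position in `b_new` still corresponds to the item at the same position in `a_new`.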

Source (Stackoverflow)