I have two lists of equal length: one is a data series and the other is the matching time series. They represent simulated values measured over time.
I want to write a function that removes a given fraction of the items from both lists at random. For example, if the fraction is 0.2, I want to randomly remove 20% of the items from both lists, but the same items (same index in each list) must be removed from each.
For example, let n = 0.2 (20% to be deleted):
a = [0,1,2,3,4,5,6,7,8,9]
b = [0,1,4,9,16,25,36,49,64,81]
a_new = [0,1,3,4,5,6,8,9]
b_new = [0,1,9,16,25,36,64,81]
    import random

    a = [0,1,2,3,4,5,6,7,8,9]
    b = [0,1,4,9,16,25,36,49,64,81]
    frac = 0.2  # how much of a/b do you want to exclude

    # generate a list of indices to exclude. Turn it into a set for O(1) lookup time
    inds = set(random.sample(list(range(len(a))), int(frac*len(a))))

    # use `enumerate` to get list indices as well as elements.
    # Filter by index, but take only the elements
    new_a = [n for i,n in enumerate(a) if i not in inds]
    new_b = [n for i,n in enumerate(b) if i not in inds]
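
Since the question asks for a function, the same idea could be wrapped up roughly like this. This is only a sketch: the name `drop_random_fraction` and the optional `seed` parameter are my own additions for illustration, and it assumes that truncating `frac * len(a)` with `int` (as above) is the rounding you want.

```python
import random


def drop_random_fraction(a, b, frac, seed=None):
    """Return copies of a and b with the same random fraction of indices removed.

    `seed` is a hypothetical convenience parameter for reproducible runs.
    """
    if len(a) != len(b):
        raise ValueError("a and b must have the same length")
    if not 0 <= frac <= 1:
        raise ValueError("frac must be between 0 and 1")

    rng = random.Random(seed)       # private generator, leaves global state alone
    n_drop = int(frac * len(a))     # truncates, e.g. 0.25 of 10 items drops 2
    drop = set(rng.sample(range(len(a)), n_drop))  # set gives O(1) membership tests

    # Keep every element whose index was not selected; original order is preserved.
    new_a = [x for i, x in enumerate(a) if i not in drop]
    new_b = [x for i, x in enumerate(b) if i not in drop]
    return new_a, new_b


a = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
b = [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
a_new, b_new = drop_random_fraction(a, b, 0.2, seed=1)
print(a_new)
print(b_new)
```

Which two items disappear depends on the seed, but both lists always lose the same positions, so each remaining time value still lines up with its data value. If you would rather round to the nearest count than truncate, replace `int(...)` with `round(...)`.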