capitalistpug - 10 days ago
Python Question

Delete element from multi-dimensional numpy array by value

Given a numpy array

a = np.array([[0, -1, 0], [1, 0, 0], [1, 0, -1]])


What's the fastest way to delete all elements with value -1 to get an array of the form

np.array([[0, 0], [1, 0, 0], [1, 0]])

Answer

Approach #1: Split the flattened, masked array at the row boundaries with np.split -

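import numpy as np

def split_based(a):
    mask = a != -1  # True where the element is kept
    # Split the kept elements at the cumulative per-row counts
    p = np.split(a[mask], mask.sum(1)[:-1].cumsum())
    # dtype=object: newer NumPy versions reject ragged rows without it
    out = np.array(list(map(list, p)), dtype=object)
    return out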
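
To see where the split indices come from, here is a minimal sketch that traces the intermediate values on the array from the question. The names `flat`, `counts` and `split_at` are introduced only for illustration and are not part of the answer's code.

```python
import numpy as np

a = np.array([[0, -1, 0], [1, 0, 0], [1, 0, -1]])

mask = a != -1                    # True where the element is kept
flat = a[mask]                    # kept elements, in row-major order
counts = mask.sum(1)              # number of kept elements per row
split_at = counts[:-1].cumsum()   # offsets in `flat` where rows 1, 2, ... begin

rows = [r.tolist() for r in np.split(flat, split_at)]

print(flat.tolist())      # [0, 0, 1, 0, 0, 1, 0]
print(counts.tolist())    # [2, 3, 2]
print(split_at.tolist())  # [2, 5]
print(rows)               # [[0, 0], [1, 0, 0], [1, 0]]
```

The last count is dropped before the cumulative sum because np.split takes the positions where a new piece begins, and the final piece simply runs to the end of `flat`.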

Approach #2: Using a list comprehension, with minimal work inside the loop -

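def loop_compr_based(a):
    mask = a != -1  # True where the element is kept
    stop = mask.sum(1).cumsum()      # end offset of each row in the flat list
    start = np.append(0, stop[:-1])  # start offset of each row
    am = a[mask].tolist()            # slice a plain list, not an array
    # dtype=object: newer NumPy versions reject ragged rows without it
    out = np.array([am[i:j] for i, j in zip(start, stop)], dtype=object)
    return out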

Sample run -

In [391]: a
Out[391]: 
array([[ 0, -1,  0],
       [ 1,  0,  0],
       [ 1,  0, -1],
       [-1, -1,  8],
       [ 3,  7,  2]])

In [392]: split_based(a)
Out[392]: array([[0, 0], [1, 0, 0], [1, 0], [8], [3, 7, 2]], dtype=object)

In [393]: loop_compr_based(a)
Out[393]: array([[0, 0], [1, 0, 0], [1, 0], [8], [3, 7, 2]], dtype=object)

Runtime test -

In [387]: a = np.random.randint(-2,10,(1000,1000))

In [388]: %timeit split_based(a)
10 loops, best of 3: 161 ms per loop

In [389]: %timeit loop_compr_based(a)
10 loops, best of 3: 29 ms per loop
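
For anyone who wants to repeat the comparison outside IPython, the following is a rough sketch using the standard `timeit` module. It makes a few assumptions: the helper `compare` and its parameters are hypothetical, both functions return plain Python lists rather than object arrays so that the results can be compared with `==`, and the default array is smaller than the 1000 x 1000 one above. Timings will differ by machine and NumPy version, so none are shown.

```python
import timeit
import numpy as np

def split_based(a):
    mask = a != -1
    p = np.split(a[mask], mask.sum(1)[:-1].cumsum())
    return [row.tolist() for row in p]   # plain lists (assumption, see lead-in)

def loop_compr_based(a):
    mask = a != -1
    stop = mask.sum(1).cumsum()
    start = np.append(0, stop[:-1])
    am = a[mask].tolist()
    return [am[i:j] for i, j in zip(start, stop)]

def compare(shape=(200, 200), number=3, seed=0):
    """Check that both approaches agree, then return best seconds per call."""
    rng = np.random.default_rng(seed)
    a = rng.integers(-2, 10, size=shape)   # values in [-2, 10), as in the test above
    assert split_based(a) == loop_compr_based(a)
    results = {}
    for func in (split_based, loop_compr_based):
        timings = timeit.repeat(lambda: func(a), number=number, repeat=3)
        results[func.__name__] = min(timings) / number
    return results

if __name__ == "__main__":
    for name, seconds in compare().items():
        print(f"{name}: {seconds * 1e3:.2f} ms per call")
```

Calling `compare(shape=(1000, 1000))` should approximate the setup of the runtime test above, apart from the list return type.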