Vasilis Lemonidis - 1 year ago 268
Python Question

# Fast way to find nonzero elements positions in 2d array in Python

The reason why I create this question is that I think numpy.nonzero function is not as optimized as it should. The following example shows this fact:

``````a=np.random.random((1000,1000))
a[a<0.5]=0
timeit.timeit(lambda:np.nonzero(a),number=100)/100.0
#gives 0.014248089790344238 secs
timeit.timeit(lambda:np.nonzero(a>0),number=100)/100.0
#gives 0.010561790466308594 secs
``````

I don't understand the fact why this happens, as I believe that both lines produce the same result for every input, whatever the dtype, thus I name it a bug, correct me if wrong. Anyways, there is a severe lack of optimization. Another way to increase the speed of finding nonzero elements is , if the function is called multiple times, the size of the input array is static and the array is sparse, to create a 3d array holding all the positions of the input array and then perform boolean slicing of this array, using the corresponding mask, ie:

``````def check_speed(a,positions,thres):
a[a<thres]=0
print 'Thresholding at:',thres
print '\t Number of nonzero elements:',np.sum(a>0)
print '\t Using numpy nonzero:', timeit.timeit(lambda:np.transpose(np.nonzero(a)),number=100)/100.0, 's'
print '\t Using improved numpy nonzero:', timeit.timeit(lambda:np.transpose(np.nonzero(a>0)),number=100)/100.0, 's'

a=np.random.random((1000,1000))
positions = np.transpose(np.nonzero(np.ones_like(a))).reshape(a.shape + (2,))
check_speed(a,positions,0.5)
check_speed(a,positions,0.8)
check_speed(a,positions,0.9)
check_speed(a,positions,0.995)
check_speed(a,positions,0.998)
check_speed(a,positions,0.999)
check_speed(a,positions,0.9995)
``````

with result:

``````Thresholding at: 0.5
Number of nonzero elements: 499248
Using numpy nonzero: 0.0171903395653 s
Using improved numpy nonzero: 0.0139024186134 s
Thresholding at: 0.8
Number of nonzero elements: 199189
Using numpy nonzero: 0.0108882308006 s
Using improved numpy nonzero: 0.00774354934692 s
Thresholding at: 0.9
Number of nonzero elements: 99493
Using numpy nonzero: 0.00896275043488 s
Using improved numpy nonzero: 0.00562391996384 s
Thresholding at: 0.995
Number of nonzero elements: 4849
Using numpy nonzero: 0.00732887983322 s
Using improved numpy nonzero: 0.00353765010834 s
Thresholding at: 0.998
Number of nonzero elements: 1930
Using numpy nonzero: 0.00729090929031 s
Using improved numpy nonzero: 0.00361618041992 s
Thresholding at: 0.999
Number of nonzero elements: 959
Using numpy nonzero: 0.00952008008957 s
Using improved numpy nonzero: 0.00364511966705 s
Thresholding at: 0.9995
Number of nonzero elements: 490
Using numpy nonzero: 0.00758473873138 s
Using improved numpy nonzero: 0.00350986003876 s
``````

Below 10000 nonzero elements in a 1000*1000 array the method seems to give faster results. I guess this happens because there is a bottleneck between memory manipulation and calculations. There was no reason to research scipy equivalents, as by experience scipy is terribly slow relatively to numpy module. So, my question is:

Does anyone know of a better way to find nonzero elements positions inside an array? Thank you beforehand for your answers.

def check_speed(a,positions,thres):
a[a

``````    print 'Thresholding at:',thres
print '\t Number of nonzero elements:',np.sum(a>0)
print '\t Using cv2:', timeit.timeit(lambda:np.fliplr(cv2.findNonZero((a>0).astype(np.uint8)).squeeze()),number=100)/100.0, 's'
print '\t Using numpy nonzero:', timeit.timeit(lambda:np.transpose(np.array(np.nonzero(a))),number=100)/100.0, 's'
print '\t Using improved numpy nonzero:', timeit.timeit(lambda:np.transpose(np.nonzero(a>0)),number=100)/100.0, 's'
a=np.random.random((1000,1000))
positions = np.transpose(np.nonzero(np.ones_like(a))).reshape(a.shape + (2,))
a[a<0.5]=0
check_speed(a,positions,0.5)
check_speed(a,positions,0.8)
check_speed(a,positions,0.9)
check_speed(a,positions,0.995)
check_speed(a,positions,0.998)
check_speed(a,positions,0.999)
check_speed(a,positions,0.9995)
``````

which gives:

``````Thresholding at: 0.5
Number of nonzero elements: 499806
Using cv2: 0.00592091083527 s
Using numpy nonzero: 0.0176041412354 s
Using improved numpy nonzero: 0.0138215017319 s
Thresholding at: 0.8
Number of nonzero elements: 200022
Using cv2: 0.00379456996918 s
Using numpy nonzero: 0.0111160302162 s
Using improved numpy nonzero: 0.00776970148087 s
Thresholding at: 0.9
Number of nonzero elements: 99835
Using cv2: 0.00287639141083 s
Using numpy nonzero: 0.00894243955612 s
Using improved numpy nonzero: 0.00543779850006 s
Thresholding at: 0.995
Number of nonzero elements: 4931
Using cv2: 0.00203591108322 s
Using numpy nonzero: 0.00747500181198 s
Using improved numpy nonzero: 0.00370697021484 s
Thresholding at: 0.998
Number of nonzero elements: 2018
Using cv2: 0.00203846931458 s
Using numpy nonzero: 0.00735512971878 s
Using improved numpy nonzero: 0.0036293721199 s
Thresholding at: 0.999
Number of nonzero elements: 1038
Using cv2: 0.00200258970261 s
Using numpy nonzero: 0.00735764026642 s
Using improved numpy nonzero: 0.00369818925858 s
Thresholding at: 0.9995
Number of nonzero elements: 521
Using cv2: 0.0019414305687 s
Using numpy nonzero: 0.00776562213898 s
Using improved numpy nonzero: 0.00395655155182 s
Last time I checked `cv2.findNonZero` was faster than `numpy.nonzero`. But you may need to make some conversions.