user3821012 user3821012 - 9 months ago 52
Python Question

numpy: detect consecutive 1 in an array

I want to detect consecutive spans of 1's in a numpy array. Indeed, I want to first identify whether the element in an array is in a span of a least three 1's. For example, we have the following array a:

import numpy as np
a = np.array([1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 0])


Then the following 1's in bold are the elements satisfy the requirement.

[1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 0]

Next, if two spans of 1's are separated by at most two 0's, then the two spans make up a longer span. So the above array is charaterized as

[1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 0]

In other words, for the original array as input, I want the output as follows:

[True, True, True, True, True, True, True, False, False, False, False, False, True, True, True, True, True, True, True, True, True, True, False]


I have been thinking of an algorithm to implement this function, but all the one I come up with seems to complicated. So I would love to know better ways to implement this -- it would be greatly appreciated if someone can help me out.

Answer Source

Instead of solving the conventional way of looping and maintaining count we can convert all 0's and 1's to a single string and replace a regex match with another char say 2. Once that is done we split out the string again and check for bool() over each char.

>>> import re
>>> lst=[1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 0]
>>> list(map(bool, map(int, list(re.sub(r'1{3,}0{1,2}1{3,}', lambda x:'2'*len(x.group()), ''.join(map(str, lst)))))))
[True, True, True, True, True, True, True, False, True, True, False, False, True, True, True, True, True, True, True, True, True, True, False]
>>> 

All the magic happens over here:

re.sub(r'1{3,}0{1,2}1{3,}', lambda x:'2'*len(x.group()), ''.join(map(str, lst)))

where it searches for a contiguous occurrence of 3 or more 1's followed by at most 2 0's i.e 1 or 2 0's followed by 3 or more 1's and replaces the whole matched string a same length string of 2's (used 2 because bool(2) is True). Also you can use tolist() method in NumPy to get a list out of NumPy array like this: np.array([1,2, 3, 4, 5, 6]).tolist()