user3821012 - 1 month ago
Python Question

numpy: detect consecutive 1 in an array

I want to detect consecutive spans of 1's in a numpy array. First, I want to identify whether each element of the array lies in a span of at least three 1's. For example, take the following array a:

import numpy as np
a = np.array([1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 0])


Then the 1's at indices 0-2, 4-6, 12-14 and 17-21 are the elements that satisfy the requirement:

[1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 0]

Next, if two such spans of 1's are separated by at most two 0's, then the two spans, together with the 0's between them, make up one longer span. So the qualifying elements of the above array become those at indices 0-6 and 12-21:

[1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 0]

In other words, for the original array as input, I want the output as follows:

[True, True, True, True, True, True, True, False, False, False, False, False, True, True, True, True, True, True, True, True, True, True, False]


I have been trying to think of an algorithm to implement this, but everything I have come up with seems too complicated. I would love to know a better way to implement it; any help would be greatly appreciated.

Answer

Instead of solving this the conventional way, by looping and maintaining a count, we can join all the 0's and 1's into a single string and replace each regex match with another character, say 2. Once that is done, we split the string back into characters, convert each one to int, and apply bool() to it.

>>> import re
>>> lst=[1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 0]
>>> list(map(bool, map(int, list(re.sub(r'1{3,}0{1,2}1{3,}', lambda x:'2'*len(x.group()), ''.join(map(str, lst)))))))
[True, True, True, True, True, True, True, False, True, True, False, False, True, True, True, True, True, True, True, True, True, True, False]

All the magic happens here:

re.sub(r'1{3,}0{1,2}1{3,}', lambda x:'2'*len(x.group()), ''.join(map(str, lst)))

It searches for a contiguous run of three or more 1's, followed by one or two 0's, followed by three or more 1's, and replaces the whole match with a string of 2's of the same length (2 is used because bool(2) is True). Note that 1's outside any match stay as 1 and so also map to True, which is why the short span at indices 8 and 9 comes out True in the output above. You can also use the tolist() method to get a list out of a NumPy array, like this: np.array([1, 2, 3, 4, 5, 6]).tolist()
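
If the short runs should come out False, as in the output requested in the question, one possible variant is sketched below. It is not the code from the answer above: it swaps re.sub for re.finditer, extends the pattern so that any number of qualifying spans can be chained, and marks only the matched positions as True. The helper name long_span_mask is made up for illustration, and the sketch assumes the input holds only the integers 0 and 1.

```python
import re
import numpy as np

# Three or more 1's, optionally followed by any number of
# "one or two 0's, then three or more 1's" groups.
SPAN = re.compile(r'1{3,}(?:0{1,2}1{3,})*')

def long_span_mask(a):
    """Hypothetical helper: True where an element lies in a merged span.

    Assumes `a` contains only the integers 0 and 1.
    """
    s = ''.join(map(str, np.asarray(a).tolist()))
    mask = np.zeros(len(s), dtype=bool)
    for m in SPAN.finditer(s):
        # Only positions covered by a match are set to True;
        # everything else (0's and short runs of 1's) stays False.
        mask[m.start():m.end()] = True
    return mask

a = np.array([1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 0, 0,
              1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 0])
print(long_span_mask(a).tolist())
```

For the array from the question this gives True at indices 0-6 and 12-21 and False everywhere else, including the two 1's at indices 8 and 9.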
