NumbThumb - 2 months ago
Python Question

Detect and count numerical sequence in Python array

In a numerical sequence (e.g. a one-dimensional array) I want to find several patterns of numbers and count the occurrences of each pattern separately. Numbers may be repeated consecutively, but only the basic pattern matters.

import numpy as np

# Example signal (1D array)
a = np.array([1,1,2,2,2,2,1,1,1,2,1,1,2,3,3,3,3,3,2,2,1,1,1])

# Search for exactly these patterns: [1,2,1], [1,2,3], [3,2,1]

# Count the number of pattern occurrences
# [1,2,1] = 2 (occurs 2 times)
# [1,2,3] = 1
# [3,2,1] = 1


I have found the Knuth-Morris-Pratt string matching recipe (http://code.activestate.com/recipes/117214/), which gives me the index of each occurrence of the searched pattern.

for s in KnuthMorrisPratt(list(a), [1,2,1]):
    print(s)


The problem is that I don't know how to handle the case where the pattern [1,2,1] "hides" in the sequence [1,2,2,2,1]. I need a way to reduce such a run of repeated numbers in order to get to [1,2,1]. Any ideas?

Answer

I don't use NumPy and I am quite new to Python, so there might be a better and more efficient solution.

I would write a function like this:

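def dac(data, pattern):
    pattern = list(pattern)
    count = 0
    # Slide a window of len(pattern) over data and compare it to the pattern
    for i in range(len(data)-len(pattern)+1):
        # list() also makes the comparison work when data is a NumPy array
        tmp = list(data[i:(i+len(pattern))])

        if tmp == pattern:
            count += 1

    return count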
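To see what this exact comparison does on the signal from the question, here is a small self-contained check. `count_exact` is only a hypothetical name for the same sliding-window idea, and the signal is written as a plain list rather than a NumPy array.

```python
def count_exact(data, pattern):
    """Count windows of data that are exactly equal to pattern."""
    data = list(data)
    pattern = list(pattern)
    width = len(pattern)
    count = 0
    for i in range(len(data) - width + 1):
        if data[i:i + width] == pattern:
            count += 1
    return count


# Signal from the question, as a plain list
signal = [1, 1, 2, 2, 2, 2, 1, 1, 1, 2, 1, 1, 2, 3, 3, 3, 3, 3, 2, 2, 1, 1, 1]

print(count_exact(signal, [1, 2, 1]))  # 1
print(count_exact(signal, [1, 2, 3]))  # 1
print(count_exact(signal, [3, 2, 1]))  # 0
```

Only contiguous occurrences are counted, so [3,2,1] is missed here because the signal contains it only in the stretched form 3,2,2,1.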

If you want to ignore repeated numbers in the middle of your pattern:

def dac(data, pattern):
    pattern = list(pattern)
    count = 0
    for i in range(len(data)-len(pattern)+1):
        # Start with the first two values at position i
        tmp = [data[i], data[i+1]]

        # Walk forward and append a value only when it differs from the previous one
        for j in range(i+1, len(data)):
            if len(tmp) == len(pattern):
                break
            if tmp[-1] != data[j]:
                tmp.append(data[j])

        if tmp == pattern:
            count += 1
    return count
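An alternative that might be easier to follow is to first collapse every run of repeated numbers to a single value and then do the exact comparison on the reduced sequence. The sketch below uses `itertools.groupby` from the standard library. The names `collapse_runs` and `count_pattern` are my own, and I am assuming that overlapping occurrences should be counted, as in the expected counts from the question.

```python
from itertools import groupby


def collapse_runs(data):
    """Return data with each run of equal neighbours reduced to one value."""
    return [value for value, _run in groupby(data)]


def count_pattern(data, pattern):
    """Count (possibly overlapping) occurrences of pattern in the collapsed data."""
    reduced = collapse_runs(data)
    pattern = list(pattern)
    width = len(pattern)
    return sum(
        reduced[i:i + width] == pattern
        for i in range(len(reduced) - width + 1)
    )


# Signal from the question, as a plain list
signal = [1, 1, 2, 2, 2, 2, 1, 1, 1, 2, 1, 1, 2, 3, 3, 3, 3, 3, 2, 2, 1, 1, 1]

print(collapse_runs(signal))  # [1, 2, 1, 2, 1, 2, 3, 2, 1]
for pattern in ([1, 2, 1], [1, 2, 3], [3, 2, 1]):
    print(pattern, count_pattern(signal, pattern))
# [1, 2, 1] 2
# [1, 2, 3] 1
# [3, 2, 1] 1
```

Note that a pattern which itself contains consecutive repeats, such as [1,1,2], can never match after collapsing, so this only fits the case where the "basic pattern" has no repeated neighbours.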

Hope that might help.
