NumbThumb - 4 months ago 38

Python Question

In a numerical sequence (e.g. one-dimensional array) I want to find different patterns of numbers and count each finding separately. However, the numbers can occur repeatedly but only the basic pattern is important.

`# Example signal (1d array)`

a = np.array([1,1,2,2,2,2,1,1,1,2,1,1,2,3,3,3,3,3,2,2,1,1,1])

# Search for these exact following "patterns": [1,2,1], [1,2,3], [3,2,1]

# Count the number of pattern occurrences

# [1,2,1] = 2 (occurs 2 times)

# [1,2,3] = 1

# [3,2,1] = 1

I have come up with the Knuth-Morris-Pratt string matching (http://code.activestate.com/recipes/117214/), which gives me the index of the searched pattern.

`for s in KnuthMorrisPratt(list(a), [1,2,1]):`

print('s')

The problem is, I don't know how to find the case, where the pattern [1,2,1] "hides" in the sequence [1,2,2,2,1]. I need to find a way to reduce this sequence of repeated numbers in order to get to [1,2,1]. Any ideas?

Answer

I don't use NumPy and I am quite new to Python, so there might be a better and more efficient solution.

I would write a function like this:

```
def dac(data, pattern):
count = 0
for i in range(len(data)-len(pattern)+1):
tmp = data[i:(i+len(pattern))]
if tmp == pattern:
count +=1
return count
```

If you want to ignore repeated numbers in the middle of your pattern:

```
def dac(data, pattern):
count = 0
for i in range(len(data)-len(pattern)+1):
tmp = [data[i], data [i+1]]
try:
for j in range(len(data)-i):
print(i, i+j)
if tmp[-1] != data[i+j+1]:
tmp.append(data[i+j+1])
if len(tmp) == len(pattern):
print(tmp)
break
except:
pass
if tmp == pattern:
count +=1
return count
```

Hope that might help.