Azia Azia - 1 month ago 23
Python Question

Approximate pattern matching?

I am trying to write code for Approximate Pattern Matching which is as below:

def HammingDistance(p, q):
d = 0
for p, q in zip(p, q): # your code here
if p!= q:
d += 1
return d
Pattern = "ATTCTGGA"
Text = "CGCCCGAATCCAGAACGCATTCCCATATTTCGGGACCACTGGCCTCCACGGTACGGACGTCAATCAAAT"
d = 3
def ApproximatePatternMatching(Pattern, Text, d):
positions = [] # initializing list of positions
for i in range(len(Text) - len(Pattern)+1):
if Pattern == Text[i:i+len(Pattern)]:
positions.append(i)# your code here
return positions
print (ApproximatePatternMatching(Pattern, Text, d))


I keep getting the following error:
Failed test #3. You may be failing to account for patterns starting at the first index of text.

Test Dataset:

GAGCGCTGG
GAGCGCTGGGTTAACTCGCTACTTCCCGACGAGCGCTGTGGCGCAAATTGGCGATGAAACTGCAGAGAGAACTGGTCATCCAACTGAATTCTCCCCGCTATCGCATTTTGATGCGCGCCGCGTCGATT
2


Your output:

['[]', '0']


Correct output:

['0', '30', '66']


Can not figure out what I am doing wrong as I am trying to learn python so don't have any idea about programming. Need help?

Answer

I'm unsure why you're getting an empty list as one of your outputs - when I run your code above I only get [0] as the print out.

Specifically, your code at present only checks for an exact character substring match, without using the hamming distance definition you also included.

The following should return the result you expect:

Pattern = "GAGCGCTGG"
Text = "GAGCGCTGGGTTAACTCGCTACTTCCCGACGAGCGCTGTGGCGCAAATTGGCGATGAAACTGCAGAGAGAACTGGTCATCCAACTGAATTCTCCCCGCTATCGCATTTTGATGCGCGCCGCGTCGATT"
d = 3

def HammingDistance(p, q):
    d = 0
    for p, q in zip(p, q): # your code here
        if p!= q:
            d += 1
    return d

def ApproximatePatternMatching(Pattern, Text, d):
    positions = [] # initializing list of positions
    for i in range(len(Text) - len(Pattern)+1):
        # and using distance < d, rather than exact matching
        if HammingDistance(Pattern, Text[i:i+len(Pattern)]) < d:
            positions.append(i)
    return positions

print (ApproximatePatternMatching(Pattern, Text, d))