Vikas Dubey - 8 days ago
Perl Question

Finding and printing a specific Pattern in a given string

I am writing code to find a specific pattern in a given string using Python or Perl. I had some success finding the pattern using C, but Python or Perl is mandatory for this assignment and I am very new to both of these languages.

My string looks like this (Amino acid sequence) :-


The pattern I want to find is


Please note that the letters between K and K\R are not fixed. However, there is only one letter between K\R and R. So, in the given string my pattern looks like this and exists between letters no. 54 and 65 (if I counted correctly), based on a "non-greedy" search :-


Previously, I used C if-else conditions to break up the given string and print out the word count (not fully successful).

printf("%c", word[i]);
if ((word[i] == 'K' || word[i] == 'R') && word[i+2] == 'R') {

I agree it didn't capture everything. If anyone can help me solve this problem, that would be great.


Regardless of the language, this looks like a task suitable for regular expressions.

Here is an example of how you could do the regex in Python:

import re

s = 'MKTSGNQD...'
print(re.findall(r'K[A-Z]+[KR][A-Z]R', s))

That should find and print out all non-overlapping matches in your string.
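The question mentions a "non-greedy" search, while the `+` in the pattern above is greedy and will stretch to the last possible K\R-x-R ending. A minimal sketch of the difference follows; the toy sequence is made up for illustration and is not the sequence from the question.

```python
import re

# Made-up toy sequence, only for illustration.
toy = 'AAKGGKARTTKLLRSRAA'

# Greedy: [A-Z]+ consumes as much as it can, then backtracks to the
# last position where [KR][A-Z]R still fits.
greedy = re.findall(r'K[A-Z]+[KR][A-Z]R', toy)

# Lazy: [A-Z]+? consumes as little as it can, so each match stops at
# the first position where [KR][A-Z]R fits.
lazy = re.findall(r'K[A-Z]+?[KR][A-Z]R', toy)

print(greedy)  # ['KGGKARTTKLLRSR']
print(lazy)    # ['KGGKAR', 'KLLRSR']
```

Which variant is right depends on whether you want the shortest or the longest stretch starting at each K.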

If you want the index where the match starts, you can do:

m = re.search(r'K[A-Z]+[KR][A-Z]R', s)
print(m.start())  # prints index (0-based)
print(m.group())  # prints matching string

Or, as @bunji points out, you can use finditer as well:

for m in re.finditer(r'K[A-Z]+[KR][A-Z]R', s):
    print(m.start())  # prints index (0-based)
    print(m.group())  # prints matching string
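Since the question counts letters from 1 while `m.start()` counts from 0, it may help to convert the positions. Here is a rough sketch, assuming inclusive 1-based positions are wanted; the helper name `find_motifs` and the toy sequence are hypothetical, not part of the original question or answer.

```python
import re

# Pattern from the answer above; pass r'K[A-Z]+?[KR][A-Z]R' for a lazy search.
DEFAULT_PATTERN = r'K[A-Z]+[KR][A-Z]R'

def find_motifs(seq, pattern=DEFAULT_PATTERN):
    """Return a (start, end, text) tuple per match, 1-based and inclusive."""
    hits = []
    for m in re.finditer(pattern, seq):
        # m.start() is 0-based; m.end() is exclusive, so it already equals
        # the 1-based position of the last matched letter.
        hits.append((m.start() + 1, m.end(), m.group()))
    return hits

toy = 'AAKGGKARTTKLLRSRAA'  # made-up sequence, only for illustration
for start, end, text in find_motifs(toy, r'K[A-Z]+?[KR][A-Z]R'):
    print(start, end, text)
```

With the lazy pattern this prints `3 8 KGGKAR` and `11 16 KLLRSR` for the toy sequence.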