Eugene S Eugene S - 2 months ago 17
Python Question

Finding longest substring in alphabetical order

EDIT: I am aware that a question with similar task was already asked in SO but I'm interested to find out the problem in this specific piece of code. I am also aware that this problem can be solved without using recursion.

The task is to write a program which will find (and print) the longest sub-string in which the letters occur in alphabetical order. If more than 1 equally long sequences were found, then the first one should be printed. For example, the output for a string

abczabcd
will be
abcz
.

I have solved this problem with recursion which seemed to pass my manual tests. However when I run an automated tests set which generate random strings, I have noticed that in some cases, the output is incorrect. For example:

if
s = 'hixwluvyhzzzdgd'
, the output is
hix
instead of
luvy


if
s = 'eseoojlsuai'
, the output is
eoo
instead of
jlsu


if
s = 'drurotsxjehlwfwgygygxz'
, the output is
dru
instead of
ehlw


After some time struggling, I couldn't figure out what is so special about these strings that causes the bug.

This is my code:

pos = 0
maxLen = 0
startPos = 0
endPos = 0


def last_pos(pos):
if pos < (len(s) - 1):
if s[pos + 1] >= s[pos]:
pos += 1
if pos == len(s)-1:
return len(s)
else:
return last_pos(pos)
return pos


for i in range(len(s)):
if last_pos(i+1) != None:
diff = last_pos(i) - i
if diff - 1 > maxLen:
maxLen = diff
startPos = i
endPos = startPos + diff

print s[startPos:endPos+1]


BTW, this is a homework. I wanted to tag it as one but I couldn't find the homework tag.

Answer

There are many things to improve in your code but making minimum changes so as to make it work. The problem is you should have if last_pos(i) != None: in your for loop (i instead of i+1) and you should compare diff (not diff - 1) against maxLen. Please read other answers to learn how to do it better.

for i in range(len(s)):
    if last_pos(i) != None:
        diff = last_pos(i) - i + 1
    if diff > maxLen:
        maxLen = diff
        startPos = i
        endPos = startPos + diff - 1