Shreyas Gune Shreyas Gune - 6 months ago 41
Python Question

Bottom Up MergeSort using Python

I've spent countless hours trying to do this. Could anyone point out my mistake?

a
is just a list,
tmp
is an empty list of size
len(a)


z
is basically
len(a)



a = [6,5,4,3,2,1]
print 'unsorted:',a
z = len(a)
tmp = range(len(a))


Here's my sort function:

def sort(a,tmp):
width=1
while(width<z):
p=0
while(p<z):
left =p
middle = p+width
right = p+2*width
merge(a,left,middle,right,tmp)
p = p+2*width
t=0
while(t<z):
a[t]=tmp[t]
t=t+1
width=width*2
print '\n'


and here's merge function :

def merge(a,iLeft,iMiddle,iRight,tmp):
i = iLeft
j = iMiddle
k = iLeft
print iLeft,iMiddle,iRight,'[',i,j,k,']'
#print i ,j, k,'\n\n'

while(i<iMiddle or j<iRight):
if(i<iMiddle and j<iRight):
if(a[i]<a[j]):
tmp[k]=a[i]
i += 1
k += 1
else:
tmp[k]=a[j]
j += 1
k += 1

elif (i==iMiddle):
tmp[k] = a[j]
j += 1
k += 1
elif (j==iRight):
tmp[k]= a[i]
i += 1
k += 1


[This Visualization] might help. I still don't know why it's acting this way.

I'm not sure where I am going wrong? Is it the indentation or the looping?

Output ::
unsorted: [6, 5, 4, 3, 2, 1]
0 1 2 [ 0 1 0 ]
2 3 4 [ 2 3 2 ]
4 5 6 [ 4 5 4 ]


0 2 4 [ 0 2 0 ]
4 6 8 [ 4 6 4 ]
Traceback (most recent call last):
File "BUmer.py", line 45, in <module>
sort(a,tmp)
File "BUmer.py", line 14, in sort
merge(a,left,middle,right,tmp)
File "BUmer.py", line 27, in merge
if(a[i]<a[j]):
IndexError: list index out of range


This visualization may help. I still don't know why this is happening.

Answer

Although it was a valiant effort the primary mistake you made was with the nested flow control approach -- the human mind can only handle so many nested while-loops. Additionally, the fact that the merge function modifies a in place makes it extremely difficult to ascertain what is happening. Even if you have an amazing mind that can keep track of all that action, save that brain energy for tackling the problem rather than flow control!

In general, you want to do your best to keep your program flat -- avoid nested flow control. Additionally, unless you are doing dedicated object-oriented-programming, instead of modifying a in place you should attempt to return a specific value.

Here is another approach to mergesort that tries to keep things more flat and explicit:

def merge(merged, a, b):

    if a and b:
        return merge(merged + [min(a[0], b[0])],
                     a[1:] if min(a[0], b[0]) == a[0] else a,
                     b[1:] if min(a[0], b[0]) == b[0] and not a[0] == b[0] else b
                     )

    elif a and not b and len(a) > 1:
        return merged + merge([], a[:len(a)/2], a[len(a)/2:])
    elif a and not b:
        return merged + a
    elif b and not a  and len(b) > 1:
        return merged + merge([], b[:len(b)/2], b[len(b)/2:])
    elif b and not a:
        return merged + b
    else:
        return merged

def mergesort(lst):

    if not lst:
        return []
    elif len(lst) == 2:
        return [min(lst), max(lst)]
    elif len(lst) == 1:
        return lst
    else:
        return merge([], mergesort(lst[:len(lst)/2]), mergesort(lst[len(lst)/2:]))

This effort to keep things explicit and minimize flow control structures is common in a style of programming known as functional programming. There's a great free book on it that you can access here:

http://www.oreilly.com/programming/free/files/functional-programming-python.pdf

Realize this might not be the exact answer you're looking for but I hope it helps.