MAA - 1 year ago 207

Python Question

This is my code for bucket sort in Python.

`from random import randrange`

def insertion_sort(aList):

for i in range(1, len(aList)):

for j in range(i, 0, -1):

if aList[j] < aList[j-1]:

aList[j], aList[j-1] = aList[j-1], aList[j]

return aList

def bucket_sort(aList):

buckets = [[]] * len(aList)

for index, value in enumerate(aList):

buckets_index = value * len(aList) // (max(aList) + 1)

buckets[buckets_index].append(value)

answer = []

for bucket in buckets:

answer.extend(insertion_sort(bucket))

# answer += insertion_sort(bucket)

print(buckets[0])

print("\n")

# return answer

aList = [randrange(10) for _ in range(100)]

print(aList)

print("\n")

answer = bucket_sort(aList)

#print(answer)

What is happening? When I run the code, I always find that the first list in buckets is already sorted and the other lists in buckets are all copies of it.

Do I need the insertion sort for each list?

What would I use the "answer" variable for?!

I'm mainly relying on this visualization.

Answer Source

One thing that i notice right off the bat is that you initialize your variable buckets as `buckets = [[]] * len(aList)`

. This makes a list of identical copies of the empty list. As such, any modification of this list is replicated in every element of `buckets`

. Change this line to:

```
buckets = [[] for _ in xrange(len(aList))]
```

To check if the lists inside the list are separate object, you could check their id's:

```
print [id(x) for x in buckets]
```

This should print a list of unique numbers.