jcuenod jcuenod - 7 months ago 40
Python Question

Why is a sorted list bigger than an unsorted list

I have a list

(I don't know whether it's relevant, but the list is a filled with Unicode Hebrew strings):

>>> len(my_list)
>>> from sys import getsizeof
>>> getsizeof(my_list)

When I sort
though, it gets bigger:

>>> my_sorted_list = sort(my_list)
>>> len(my_sorted_list)
>>> getsizeof(my_sorted_list)

Why is
returning a list that takes more space in memory than the initial unsorted list?


As Ignacio points out this is due to Python allocating a bit more memory than required. sorted creates a new list out of the sequence provided, sorts it in place and returns it. To create the new list Python extends an empty sized list with the one passed; that results in over-allocation (after calling list_resize).

It is worth noting that this difference is mostly present when the original list was created with a list-comp (where if, a final append doesn't trigger a resize if space is available, the size is smaller) or when literals are used (for which no appends are made and direct assigning to the underlying array is performed, keeping the size to its minimum):

l = [i for i in range(1000)]
Out[85]: 9024

Out[84]: 9112

When creating through list, the memory is always over-allocated; Python knows the sizes and preempts future modifications by over-allocating a bit based on the size:

l = list(range(1000))
Out[86]: 9112

Out[87]: 9112

It's also required to point out that this is behavior specific the C implementation of Python. Jython, IronPython and others might/might not have the same behavior.