d.putto - 4 months ago 9
Python Question

Set changes element order?

Recently I noticed that when I convert a list to a set, the order of the elements changes and appears to be sorted by character.

Consider this example:

x=[1,2,20,6,210]
print x
# [1, 2, 20, 6, 210] # the order is the same as the initial order

set(x)
# set([1, 2, 20, 210, 6]) # the order in the set(x) output has changed and looks sorted


My questions are:


  1. Why is this happening?

  2. How can I do set operations (especially Set Difference) without losing the initial order?


Answer
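  1. A set is an unordered data structure. Its iteration order depends on the hashes of the elements and the layout of the internal hash table, not on insertion order, so the original order is not preserved and any apparent sorting is incidental.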

  2. Don't use a set, but rather collections.OrderedDict:

    >>> import collections
    >>> a = collections.OrderedDict.fromkeys([1, 2, 20, 6, 210])
    >>> b = collections.OrderedDict.fromkeys([6, 20, 1])
    >>> collections.OrderedDict.fromkeys(x for x in a if x not in b)
    OrderedDict([(2, None), (210, None)])
    

    Note that the order of b does not matter, so b could be any collection, but it should support O(1) membership tests (as a set, a dict, or an OrderedDict does).
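The OrderedDict approach above can be wrapped in small helpers so that the result of one operation can feed the next. This is only a minimal sketch: the function names `ordered_difference` and `ordered_intersection` are made up for illustration and are not part of the `collections` module, and the elements are assumed to be hashable.

```python
from collections import OrderedDict


def _as_lookup(b):
    # Reuse b if it already offers fast membership tests; otherwise build a set once.
    return b if isinstance(b, (set, frozenset, dict)) else set(b)


def ordered_difference(a, b):
    """Elements of a that are not in b, in the order they appear in a."""
    lookup = _as_lookup(b)
    return OrderedDict.fromkeys(x for x in a if x not in lookup)


def ordered_intersection(a, b):
    """Elements of a that are also in b, in the order they appear in a."""
    lookup = _as_lookup(b)
    return OrderedDict.fromkeys(x for x in a if x in lookup)


a = OrderedDict.fromkeys([1, 2, 20, 6, 210])
b = OrderedDict.fromkeys([6, 20, 1])

diff = ordered_difference(a, b)          # keys: 2, 210
chained = ordered_difference(diff, [210])  # keys: 2
common = ordered_intersection(a, b)      # keys: 1, 20, 6
```

Because each helper returns an OrderedDict, the result keeps both its order and O(1) membership tests, which is what makes chaining possible. On Python 3.7 and later a plain dict also preserves insertion order, so `dict.fromkeys` would likely serve the same purpose there.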

Edit: The answer above assumes that you want to be able to perform (ordered) set operations on every collection involved, including the result of an earlier set operation. If this is not necessary, you can simply use lists for some of the collections and sets for others, e.g.

>>> a = [1, 2, 20, 6, 210]
>>> b = set([6, 20, 1])
>>> [x for x in a if x not in b]
[2, 210]

This loses the order of b and does not allow fast membership tests on a or on the result. Sets allow fast membership tests, and lists keep order. If you need both of these features on the same collection, then use collections.OrderedDict.
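One thing the list-and-set version does not do is remove duplicates from a, since a list keeps every occurrence. The sketch below shows one possible way to handle that; the function name `difference_keep_order` and its `unique` flag are hypothetical, chosen only for this example, and the elements are assumed to be hashable.

```python
def difference_keep_order(items, exclude, unique=False):
    """Items not in exclude, in their original order.

    With unique=True, later repeats of an item are dropped as well.
    """
    exclude = set(exclude)  # one-time conversion for O(1) membership tests
    seen = set()
    result = []
    for x in items:
        if x in exclude:
            continue
        if unique:
            if x in seen:
                continue
            seen.add(x)
        result.append(x)
    return result


a = [1, 2, 20, 6, 210, 2]
b = [6, 20, 1]

with_repeats = difference_keep_order(a, b)                  # [2, 210, 2]
without_repeats = difference_keep_order(a, b, unique=True)  # [2, 210]
```

The result is a plain list, so it keeps order but membership tests on it are O(n), which is the trade-off described above.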
