donkopotamus - 4 months ago
Python Question

Comparison of collections containing non-reflexive elements

In Python, a value x is not always constrained to equal itself. Perhaps the best-known example is NaN:

>>> x = float("NaN")
>>> x == x
False


Now consider a list of exactly one item. We might consider two such lists to be equal if and only if the items they contain are equal. For example:

>>> ["hello"] == ["hello"]
True


But this does not appear to be the case with NaN:

>>> x = float("NaN")
>>> x == x
False
>>> [x] == [x]
True


So lists whose items are "not equal" can themselves be "equal". But only sometimes; in particular:


  • two lists consisting of the same instance of NaN are considered equal; while

  • two separate lists consisting of different instances of NaN are not equal



Observe:

>>> x = float("NaN")
>>> [x] == [x]
True
>>> [x] == [float("NaN")]
False


This general behaviour also applies to other collection types such as tuples and sets. Is there a good rationale for this?
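For concreteness, here is a minimal sketch of the tuple and set cases mentioned above. The variable names (nan, other, and the result names) are only illustrative, and the snippet assumes CPython, where each call to float("nan") returns a new object:

```python
nan = float("nan")
other = float("nan")  # a second, distinct NaN object

# Same NaN object on both sides: the containers compare equal.
same_tuple = (nan,) == (nan,)
same_set = {nan} == {nan}

# Distinct NaN objects: the containers compare unequal.
diff_tuple = (nan,) == (other,)
diff_set = {nan} == {other}

print(same_tuple, same_set)  # True True
print(diff_tuple, diff_set)  # False False
```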

Answer

Per the docs,

In enforcing reflexivity of elements, the comparison of collections assumes that for a collection element x, x == x is always true. Based on that assumption, element identity is compared first, and element comparison is performed only for distinct elements. This approach yields the same result as a strict element comparison would, if the compared elements are reflexive. For non-reflexive elements, the result is different than for strict element comparison, and may be surprising: The non-reflexive not-a-number values for example result in the following comparison behavior when used in a list:

 >>> nan = float('NaN')
 >>> nan is nan
 True
 >>> nan == nan
 False                 <-- the defined non-reflexive behavior of NaN
 >>> [nan] == [nan]
 True                  <-- list enforces reflexivity and tests identity first
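The rule quoted above can be modelled in pure Python. This is only a rough sketch of the described behaviour, not the actual implementation (CPython does this in C), and the helper names identity_first_eq and strict_eq are hypothetical:

```python
def identity_first_eq(a, b):
    """Rough model of list equality: test identity first, then fall back to ==."""
    if len(a) != len(b):
        return False
    return all(x is y or x == y for x, y in zip(a, b))


def strict_eq(a, b):
    """Strict element comparison: uses == only and ignores identity."""
    if len(a) != len(b):
        return False
    return all(x == y for x, y in zip(a, b))


nan = float("NaN")
print(identity_first_eq([nan], [nan]))           # True, like [nan] == [nan]
print(identity_first_eq([nan], [float("NaN")]))  # False, distinct objects
print(strict_eq([nan], [nan]))                   # False, since nan != nan
print(strict_eq(["hello"], ["hello"]))           # True, reflexive elements
```

For reflexive elements such as strings the two helpers agree, which is the docs' point: the identity shortcut only changes the result for values like NaN.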