Apparently Python will allow me to hash a generator expression like
(i for i in [1, 2, 3, 4, 5])
>>> hash(i for i in [1, 2, 3, 4, 5])
>>> hash(i for i in range(2))
>>> hash(i for i in [1, 2, 3, 4, 5, 6])
>>> hash(i for i in [0, 1, 2, 3, 4, 5, 6])
So there are actually two questions here:
1. Why are generators hashable at all?
2. Why do different generators produce the same hash?
For 2, the answer is simple: in this case, hash is based on the object's id. Since you don't actually store the object, it is discarded right after the call and its memory gets reused. That means the next generator has the same id and thus the same hash.
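A minimal sketch of that effect, assuming CPython (memory reuse is an implementation detail, so the first comparison is not guaranteed either way). The names h1, h2, g1 and g2 are only for illustration:

```python
# Temporaries: each generator is discarded right after hash() returns,
# so the next one may be allocated at the same address.
h1 = hash(i for i in [1, 2, 3, 4, 5])
h2 = hash(i for i in range(2))
print(h1 == h2)  # often True on CPython, but implementation-dependent

# Stored generators: both objects are alive at the same time,
# so they cannot share an address and their ids differ.
g1 = (i for i in [1, 2, 3, 4, 5])
g2 = (i for i in range(2))
print(id(g1) == id(g2))      # False
print(hash(g1) == hash(g2))  # False on CPython
```

Keeping a reference to each generator is enough to make the hashes differ, which suggests the collision comes from the object's lifetime rather than from its content.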
For 1, the answer is "because they can".
hash is primarily meant for use in dict, set and other situations where it allows identifying an object. These situations impose the constraint that a == b also implies hash(a) == hash(b) (the reverse is not constrained).
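The constraint can be checked with a short sketch. ConstantHash is a hypothetical class made up for this illustration, not something from the standard library:

```python
# Equal objects must hash equally: 1, 1.0 and True all compare equal.
print(1 == 1.0 == True)                    # True
print(hash(1) == hash(1.0) == hash(True))  # True

# The reverse is not required. Instances of this hypothetical class all
# share one hash, but still compare by identity (the default __eq__).
class ConstantHash:
    def __hash__(self):
        return 42

a, b = ConstantHash(), ConstantHash()
print(hash(a) == hash(b))  # True
print(a == b)              # False

# Despite the hash collision, both remain distinct set members.
print(len({a, b}))         # 2
```

Colliding hashes only cost lookup speed; it is unequal hashes for equal objects that would break dict and set.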
For list, dict and other collections, equality is based on content. For example, [1,2,3] == [1,2,3] regardless of whether both are the same object. This means that if something is added to one of them, their equality changes, and thus their hash would have to change as well. Thus, hash is undefined for them, as it must stay constant for it to work in set and similar containers.
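A small sketch of why a mutable list cannot have a stable hash. The variable names are illustrative only:

```python
a = [1, 2, 3]
b = [1, 2, 3]
print(a == b, a is b)  # True False: equal by content, distinct objects

# Mutating one list changes the outcome of the comparison.
a.append(4)
print(a == b)          # False

# Python therefore refuses to hash lists at all.
try:
    hash(a)
    list_hashable = True
except TypeError:
    list_hashable = False
print(list_hashable)   # False

# A tuple of hashable items cannot change, so a content-based hash is safe.
print(hash((1, 2, 3)) == hash((1, 2, 3)))  # True
```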
In contrast, a generator can have any content. Consider, for example, a generator providing random values. Thus, it makes no sense to compare generators by content. They are only ever compared by identity. So, a == b is equivalent to id(a) == id(b) for generators. In turn, this means basing hash(a) on id(a) will always satisfy the constraint that equality places on hash.
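As a rough illustration, two generators built from the same expression are still unequal, and each works as its own dictionary key. The names g1, g2 and seen are assumptions made for this sketch:

```python
g1 = (i for i in [1, 2, 3])
g2 = (i for i in [1, 2, 3])

# Generators compare by identity, not by the values they would yield.
print(g1 == g1)  # True
print(g1 == g2)  # False

# Both can be used as keys, since their hashes never change.
seen = {g1: "first", g2: "second"}

# Consuming a generator changes its state, but not its identity or hash.
next(g1)
print(seen[g1])   # first
print(len(seen))  # 2
```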