Gaut Gaut - 20 days ago 5
Python Question

Python: iterating over a dict-like object

class Test(object):

    def __init__(self, store):
        assert isinstance(store, dict)
        self.store = store

    def __getitem__(self, key):
        return self.store[key]


I am trying to iterate over instances of this class. According to the documentation, implementing __getitem__ should be enough to make my Test class iterable. Indeed, when I try to iterate over it, Python does not complain that the object is not iterable, but I get a KeyError:

In [10]: a = Test({1:1,2:2})

In [11]: for i in a: print i
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
<ipython-input-11-8c9c9a8afa41> in <module>()
----> 1 for i in a: print i

<ipython-input-9-17212ae08f42> in __getitem__(self, key)
4 self.store = store
5 def __getitem__(self, key):
----> 6 return self.store[key]
7

KeyError: 0



  • Where does this 0 come from? (What is going on under the hood?)



I know I can solve it by adding an __iter__ method:

def __iter__(self):
    return dict.__iter__(self.store)



  • Is this the best way to solve the problem? (I could also inherit from the dict class.)


Answer

You missed a crucial piece of wording in the documentation you found:

For sequence types, the accepted keys should be integers and slice objects. [...] [I]f of a value outside the set of indexes for the sequence (after any special interpretation of negative values), IndexError should be raised.

Note: for loops expect that an IndexError will be raised for illegal indexes to allow proper detection of the end of the sequence.

Bold italic emphasis is mine. If your __getitem__ accepts arbitrary keys rather than integers, you don't have a sequence.

The Python glossary explains more; see the definition of sequence:

An iterable which supports efficient element access using integer indices via the __getitem__() special method and defines a __len__() method that returns the length of the sequence. [...] Note that dict also supports __getitem__() and __len__(), but is considered a mapping rather than a sequence because the lookups use arbitrary immutable keys rather than integers.

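So sequences accept integer indices, and that's exactly what for provides when iterating *. When given an object to iterate over, if it has no __iter__ method but __getitem__ is available, a special iterator is constructed that starts at index 0 and keeps incrementing it until IndexError is raised. That is where the 0 comes from: your dictionary has no key 0, so the first lookup raises KeyError instead of IndexError, and the exception propagates. In pure Python, the special iterator would be: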

def getitem_iterator(obj):
    getitem = type(obj).__getitem__  # special method, so on the type
    index = 0
    try:
        while True:
            yield getitem(obj, index)
            index += 1
    except IndexError:
        # iteration complete
        return

The actual implementation is in C; see the PySeqIter_Type definition and its associated functions.
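
To see the fallback in action, here is a small sketch. The classes Countdown and DictBacked are hypothetical names introduced only for illustration; DictBacked stands in for the Test class from the question.

```python
class Countdown(object):
    """Hypothetical sequence-like class: integer indexes, IndexError at the end."""

    def __init__(self, start):
        self.start = start

    def __getitem__(self, index):
        if not 0 <= index < self.start:
            raise IndexError(index)
        return self.start - index


class DictBacked(object):
    """Hypothetical stand-in for the Test class from the question."""

    def __init__(self, store):
        self.store = store

    def __getitem__(self, key):
        return self.store[key]


# The fallback iterator asks for index 0, 1, 2, ... and stops on IndexError.
countdown_items = list(Countdown(3))  # [3, 2, 1]

# The dict has no key 0, so the very first lookup raises KeyError(0),
# which is not treated as the end of the sequence.
missing_key = None
try:
    list(DictBacked({1: 1, 2: 2}))
except KeyError as exc:
    missing_key = exc.args[0]  # 0
```

Countdown iterates cleanly because it honours the sequence contract; DictBacked fails on the first step because it answers to dictionary keys rather than positions.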

Implement the __iter__ method instead; it is used when present. Since you wrap a dictionary, you could simply return the iterator for that dictionary:

def __iter__(self):
    return iter(self.store)
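
Put together with the class from the question, this might look like the sketch below. The variables a, keys and values are only there to illustrate the behaviour; note that iterating yields the dictionary's keys, just as iterating over a plain dict does.

```python
class Test(object):

    def __init__(self, store):
        assert isinstance(store, dict)
        self.store = store

    def __getitem__(self, key):
        return self.store[key]

    def __iter__(self):
        # Delegate to the wrapped dict; this yields its keys.
        return iter(self.store)


a = Test({1: 1, 2: 2})
keys = sorted(a)               # [1, 2]
values = [a[k] for k in keys]  # [1, 2]
```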

* Technically speaking, for doesn't provide this itself. It simply calls iter(obj), and it is that call that produces the special iterator when no __iter__ method is available.
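
A short sketch of that point, using a hypothetical Squares class that defines only __getitem__. Calling iter() directly, without any for loop, already returns a working iterator:

```python
class Squares(object):
    """Hypothetical class with only __getitem__ (no __iter__)."""

    def __getitem__(self, index):
        if index >= 3:
            raise IndexError(index)
        return index * index


has_iter_method = hasattr(Squares, "__iter__")  # False

it = iter(Squares())  # iter() builds the fallback iterator here
first = next(it)      # 0, from index 0
rest = list(it)       # [1, 4], from indexes 1 and 2
```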