wim wim - 11 months ago 46
Python Question

Modifying a dict during iteration

What's going on below

>>> d = {0:0}
>>> for i in d:
... del d[i]
... d[i+1] = 0
... print(i)

Why does the iteration stop at 8 without any error?

Reproducible on both python2.7 and python 3.5.

Answer Source

This behavior originates from the key look up algorithm in cpython static PyDictKeyEntry * lookdict(...) as written in the document:

The basic lookup function used by all operations. This is based on Algorithm D from Knuth Vol. 3, Sec. 6.4. ... The initial probe index is computed as hash mod the table size (which initially equals to 8).

At the beginning of every for loop, the dict_next function is called internally to resolve the address of the next element. The core of this function reads:

value_ptr = &mp->ma_keys->dk_entries[i].me_value;
mask = DK_MASK(mp->ma_keys); // size of the array which stores the key values (ma_keys)
while (i <= mask && *value_ptr == NULL) { // skip NULL elements ahead 
    value_ptr = (PyObject **)(((char *)value_ptr) + offset);
if (i > mask)
    return -1; // raise StopIteration 

where i is the index of the C array which actually stores the values. As written above, the initial index of a key is calculated from hash(key)%table_size. The other element in the array is all set to NULL since the dict contains only one element in your test case.

Given the fact that hash(i)==i if i is an int, the memory layout of the dict in your example will be:

1st iter: [0,   NULL,NULL,NULL,NULL,NULL,NULL,NULL]; i=0
2nd iter: [NULL,1   ,NULL,NULL,NULL,NULL,NULL,NULL]; i=1
8th iter: [NULL,NULL,NULL,NULL,NULL,NULL,NULL,7   ]; i=7

A more interesting test case would be:

def f(d):
  for i in d:
    del d[i]
    print hash(i)%8
f({0:0})       # outputs 0,1,6
f({'hello':0}) # outputs 5,7
f({'world':0}) # outputs 1

To conclude, the exiting condition of such loop is