max max - 1 month ago 20
Python Question

How is the __subclasses__ method implemented in CPython?

The docs say that:


Each class keeps a list of weak references to its immediate subclasses. This method returns a list of all those references still alive.


But how does each class obtain a list of weak references to its subclasses in the first place? In other words, when I create

class B(A):
    pass

how does A find out that B just subclassed it? And is this mechanism robust enough to survive edge cases (custom metaclasses, assignment to __bases__, etc.)?

Answer

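As part of the initialization of a new class, a weak reference to that class is added to the tp_subclasses member of each of its base classes. In the version shown here, tp_subclasses is a dict that maps the subclass's address to the weak reference, created lazily the first time a base gains a subclass. You can see this in the Python source code in Objects/typeobject.c: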

int
PyType_Ready(PyTypeObject *type)
{
    ...
    /* Link into each base class's list of subclasses */
    bases = type->tp_bases;
    n = PyTuple_GET_SIZE(bases);
    for (i = 0; i < n; i++) {
        PyObject *b = PyTuple_GET_ITEM(bases, i);
        if (PyType_Check(b) &&
            add_subclass((PyTypeObject *)b, type) < 0)
            goto error;
    }
    ...
}

static int
add_subclass(PyTypeObject *base, PyTypeObject *type)
{
    int result = -1;
    PyObject *dict, *key, *newobj;

    dict = base->tp_subclasses;
    if (dict == NULL) {
        base->tp_subclasses = dict = PyDict_New();
        if (dict == NULL)
            return -1;
    }
    assert(PyDict_CheckExact(dict));
    key = PyLong_FromVoidPtr((void *) type);
    if (key == NULL)
        return -1;
    newobj = PyWeakref_NewRef((PyObject *)type, NULL);
    if (newobj != NULL) {
        result = PyDict_SetItem(dict, key, newobj);
        Py_DECREF(newobj);
    }
    Py_DECREF(key);
    return result;
}
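
The effect of add_subclass can be observed from Python. The following is a minimal sketch, assuming CPython; the class names and the subclass_names helper are made up for illustration. It shows that the class statement alone registers B with A, and that the entry disappears once B is collected, because only a weak reference is stored.

```python
import gc


def subclass_names(cls):
    """Return the names of the live immediate subclasses of cls."""
    return [sub.__name__ for sub in cls.__subclasses__()]


class A:
    pass


class B(A):
    pass


# Creating B linked it into A's subclass registry.
names_before = subclass_names(A)

# A holds only a weak reference to B, so removing the last strong
# reference lets B be collected. Classes take part in reference
# cycles, so an explicit collection is needed to free it promptly.
del B
gc.collect()
names_after = subclass_names(A)
```

The explicit gc.collect() call is there only to make the result deterministic; without it the entry goes away whenever the cyclic collector next runs.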

The setter for __bases__ also updates the subclass lists of each of the old and new bases:

static int
type_set_bases(PyTypeObject *type, PyObject *new_bases, void *context)
{
    ...
    if (type->tp_bases == new_bases) {
        /* any base that was in __bases__ but now isn't, we
           need to remove |type| from its tp_subclasses.
           conversely, any class now in __bases__ that wasn't
           needs to have |type| added to its subclasses. */

        /* for now, sod that: just remove from all old_bases,
           add to all new_bases */
        remove_all_subclasses(type, old_bases);
        res = add_all_subclasses(type, new_bases);
        update_all_slots(type);
    }
    ...
}
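
The same bookkeeping is visible from Python when __bases__ is reassigned. This is a small sketch with hypothetical class names (Old, New, Child), chosen so that the two bases have compatible layouts and the assignment is allowed.

```python
class Old:
    pass


class New:
    pass


class Child(Old):
    pass


# Before the assignment, Child is registered with Old only.
old_before = Old.__subclasses__()
new_before = New.__subclasses__()

# The __bases__ setter removes Child from every old base and
# adds it to every new base.
Child.__bases__ = (New,)

old_after = Old.__subclasses__()
new_after = New.__subclasses__()
```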

Note that if a metaclass does something to customize the meaning of the subclass relationship, __subclasses__ won't reflect that. For example, issubclass(list, collections.abc.Iterable) is True, but list won't show up in a search of the __subclasses__ tree starting from collections.abc.Iterable.
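
To make that concrete, here is a sketch that walks the whole __subclasses__ tree and compares the result with issubclass. The all_subclasses helper is not part of the standard library; it is written here only for illustration.

```python
from collections.abc import Iterable


def all_subclasses(cls):
    """Collect every class reachable from cls through __subclasses__()."""
    found = set()
    stack = [cls]
    while stack:
        for sub in stack.pop().__subclasses__():
            if sub not in found:
                found.add(sub)
                stack.append(sub)
    return found


# The ABC machinery answers the issubclass check without real inheritance.
is_virtual_subclass = issubclass(list, Iterable)

# list inherits directly from object, so it is never linked into the
# subclass registry of Iterable or of any of its real subclasses.
is_in_tree = list in all_subclasses(Iterable)
```

Real subclasses such as collections.abc.Iterator do appear in the tree, since they inherit from Iterable in the ordinary way.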
