joshua.r.smith - 3 months ago
Python Question

Functionality of Python `in` vs. `__contains__`

I implemented the `__contains__` method on a class for the first time the other day, and the behavior wasn't what I expected. I suspect there's some subtlety to the `in` operator that I don't understand and I was hoping someone could enlighten me.

It appears to me that the `in` operator doesn't simply wrap an object's `__contains__` method, but it also attempts to coerce the output of `__contains__` to boolean. For example, consider the class:

class Dummy(object):
    def __contains__(self, val):
        # Don't perform comparison, just return a list as
        # an example.
        return [False, False]


The `in` operator and a direct call to the `__contains__` method return very different output:

>>> dum = Dummy()
>>> 7 in dum
True
>>> dum.__contains__(7)
[False, False]


Again, it looks like `in` is calling `__contains__` but then coercing the result to `bool`. I can't find this behavior documented anywhere except for the fact that the `__contains__` documentation says `__contains__` should only ever return `True` or `False`.

I'm happy following the convention, but can someone tell me the precise relationship between `in` and `__contains__`?

Epilogue



I decided to accept @eli-korvigo's answer, but everyone should look at @ashwini-chaudhary's comment about the bug, below.

Answer

Use the source, Luke!

Let's trace the implementation of the `in` operator:

>>> import dis
>>> class test(object):
...     def __contains__(self, other):
...         return True

>>> def in_():
...     return 1 in test()

>>> dis.dis(in_)
    2           0 LOAD_CONST               1 (1)
                3 LOAD_GLOBAL              0 (test)
                6 CALL_FUNCTION            0 (0 positional, 0 keyword pair)
                9 COMPARE_OP               6 (in)
               12 RETURN_VALUE
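
The exact instruction name depends on the interpreter version: CPython 3.9 and later emit a dedicated CONTAINS_OP for `in`, while older versions emit COMPARE_OP as in the listing above. A minimal sketch for checking this on your own interpreter, collecting opcode names rather than relying on byte offsets (the class is renamed `Test` here purely for illustration):

```python
import dis


class Test(object):
    def __contains__(self, other):
        return True


def in_():
    return 1 in Test()


# Collect the opcode names instead of relying on exact byte offsets.
opnames = [instr.opname for instr in dis.get_instructions(in_)]
print(opnames)

# Older interpreters use COMPARE_OP for `in`; CPython 3.9+ uses CONTAINS_OP.
membership_ops = [name for name in opnames
                  if name in ("COMPARE_OP", "CONTAINS_OP")]
print(membership_ops)
```

Whichever of the two names shows up, the membership test itself is still carried out by `PySequence_Contains`.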

As you can see, the `in` operator becomes the COMPARE_OP virtual machine instruction. The handler for that instruction is in ceval.c:

TARGET(COMPARE_OP)
    w = POP();
    v = TOP();
    x = cmp_outcome(oparg, v, w);
    Py_DECREF(v);
    Py_DECREF(w);
    SET_TOP(x);
    if (x == NULL) break;
    PREDICT(POP_JUMP_IF_FALSE);
    PREDICT(POP_JUMP_IF_TRUE);
    DISPATCH(); 

Take a look at the relevant case of the switch in cmp_outcome():

case PyCmp_IN:
    res = PySequence_Contains(w, v);
    if (res < 0)
         return NULL;
    break;

Here we have the PySequence_Contains call:

int
PySequence_Contains(PyObject *seq, PyObject *ob)
{
    Py_ssize_t result;
    PySequenceMethods *sqm = seq->ob_type->tp_as_sequence;
    if (sqm != NULL && sqm->sq_contains != NULL)
        return (*sqm->sq_contains)(seq, ob);
    result = _PySequence_IterSearch(seq, ob, PY_ITERSEARCH_CONTAINS);
    return Py_SAFE_DOWNCAST(result, Py_ssize_t, int);
}

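That always returns a C int: 1 if the item is found, 0 if it is not, and -1 on error. For a class defined in Python, the sq_contains slot is a wrapper that calls `__contains__` and reduces whatever it returns to a truth value, and cmp_outcome() then turns the int into `True` or `False`. That is why `7 in dum` evaluates to `True`: the non-empty list `[False, False]` is truthy.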
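
To make the relationship concrete, here is a rough Python-level model of what the C code above does. It is only a sketch, not CPython's actual implementation: `contains_sketch`, `Empty` and `OnlyIter` are hypothetical names introduced for illustration, and error handling is omitted.

```python
def contains_sketch(container, item):
    """Illustrative model of `item in container`; not the real implementation."""
    # Special methods are looked up on the type, not on the instance.
    method = getattr(type(container), "__contains__", None)
    if method is not None:
        # Whatever __contains__ returns is reduced to its truth value.
        return bool(method(container, item))
    # No __contains__: fall back to iterating and comparing each element.
    for element in iter(container):
        if element is item or element == item:
            return True
    return False


class Dummy(object):
    def __contains__(self, val):
        return [False, False]  # non-empty list, so truthy


class Empty(object):  # hypothetical counter-example
    def __contains__(self, val):
        return []  # empty list, so falsy


class OnlyIter(object):  # hypothetical class without __contains__
    def __iter__(self):
        return iter([1, 2, 3])


print(7 in Dummy(), contains_sketch(Dummy(), 7))        # True True
print(7 in Empty(), contains_sketch(Empty(), 7))        # False False
print(2 in OnlyIter(), contains_sketch(OnlyIter(), 2))  # True True
```

So `x in y` behaves like the truth value of `type(y).__contains__(y, x)` when that method exists, and like a linear search over `y` otherwise.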

P.S.

Thanks to Martijn Pieters for showing how to find the implementation of the `in` operator.