Marcelo Assis - 8 months ago 33

Python Question

My limited brain cannot understand why this happens:

`>>> print '' in 'lolsome'`

True

In PHP, an equivalent comparison returns false:

`var_dump(strpos('', 'lolsome'));`

Answer

For the Unicode and string types, `x in y` is true if and only if x is a substring of y. An equivalent test is `y.find(x) != -1`. Note, x and y need not be the same type; consequently, `u'ab' in 'abc'` will return `True`. Empty strings are always considered to be a substring of any other string, so `"" in "abc"` will return `True`.
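A minimal sketch of the equivalence quoted above, written so that it also runs on Python 3, where the behavior is the same. The helper name `is_substring` is made up for illustration and is not part of any library:

```python
def is_substring(x, y):
    """Mirror the documented equivalence: x in y  <=>  y.find(x) != -1."""
    return y.find(x) != -1

# The empty string is found at index 0 of any string,
# so find() never returns -1 for it.
samples = [("", "lolsome"), ("lol", "lolsome"), ("xyz", "lolsome"), ("", "")]
for x, y in samples:
    assert (x in y) == is_substring(x, y)

print("lolsome".find(""))   # 0
print("" in "lolsome")      # True
```

Seen this way, `'' in 'lolsome'` is true for the same reason that `'lolsome'.find('')` returns `0` rather than `-1`.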

From looking at your `print` call, you're using 2.x.

To go deeper, look at the bytecode:

```
>>> import dis
>>> def answer():
...     '' in 'lolsome'
...
>>> dis.dis(answer)
  2           0 LOAD_CONST               1 ('')
              3 LOAD_CONST               2 ('lolsome')
              6 COMPARE_OP               6 (in)
              9 POP_TOP
             10 LOAD_CONST               0 (None)
             13 RETURN_VALUE
```
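The listing above is from Python 2. As a hedged sketch for readers on Python 3: the same inspection can be done programmatically with `dis.get_instructions`. The opcode name depends on the interpreter version (older releases use `COMPARE_OP`, while CPython 3.9 and later have a dedicated `CONTAINS_OP`), so the code below accepts either. The function name `contains` is only illustrative:

```python
import dis

def contains(needle, haystack):
    # Arguments are used instead of literals so the compiler
    # has nothing to simplify away.
    return needle in haystack

# Collect the opcode names; the exact set is version-dependent.
opnames = [ins.opname for ins in dis.get_instructions(contains)]

# Keep only the opcodes that can implement the `in` test.
membership_ops = [name for name in opnames
                  if name in ("COMPARE_OP", "CONTAINS_OP")]
print(membership_ops)
```

Whichever opcode is reported, the membership test ends up in the same C-level containment machinery discussed next.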

`COMPARE_OP` is where the boolean operation is performed, and the interpreter source for that opcode shows where the comparison happens:

```
TARGET(COMPARE_OP)
    w = POP();
    v = TOP();
    x = cmp_outcome(oparg, v, w);
    Py_DECREF(v);
    Py_DECREF(w);
    SET_TOP(x);
    if (x == NULL) break;
    PREDICT(POP_JUMP_IF_FALSE);
    PREDICT(POP_JUMP_IF_TRUE);
    DISPATCH();
```

`cmp_outcome` is defined in the same file, and there it's easy to find our next clue:

```
res = PySequence_Contains(w, v);
```

which is in abstract.c:

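```
int
PySequence_Contains(PyObject *seq, PyObject *ob)
{
    Py_ssize_t result;
    /* Prefer the type's own sq_contains slot when it provides one. */
    if (PyType_HasFeature(seq->ob_type, Py_TPFLAGS_HAVE_SEQUENCE_IN)) {
        PySequenceMethods *sqm = seq->ob_type->tp_as_sequence;
        if (sqm != NULL && sqm->sq_contains != NULL)
            return (*sqm->sq_contains)(seq, ob);
    }
    /* Otherwise fall back to iterating over the sequence. */
    result = _PySequence_IterSearch(seq, ob, PY_ITERSEARCH_CONTAINS);
    return Py_SAFE_DOWNCAST(result, Py_ssize_t, int);
}
```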

and to come up for air from the source, we find this next function in the documentation:

`objobjproc PySequenceMethods.sq_contains`

This function may be used by `PySequence_Contains()` and has the same signature. This slot may be left to NULL, in this case `PySequence_Contains()` simply traverses the sequence until it finds a match.
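The two paths described here can be observed from pure Python. This is a sketch rather than a view into the C code: it assumes the usual CPython mapping in which a class's `__contains__` method fills the `sq_contains` slot, and the class names `WithContains` and `IterOnly` are invented for the example:

```python
class WithContains:
    """Defines __contains__, so `in` asks the object directly."""
    def __contains__(self, item):
        return item == "anything"

class IterOnly:
    """No __contains__: `in` falls back to iterating and comparing."""
    def __init__(self, items):
        self.items = items
        self.visited = []          # records what the fallback looked at

    def __iter__(self):
        for item in self.items:
            self.visited.append(item)
            yield item

print("anything" in WithContains())   # True

seq = IterOnly(["a", "b", "c"])
print("b" in seq)                     # True
print(seq.visited)                    # ['a', 'b']
```

The `visited` list shows the fallback traversing items only until it finds a match, as the documentation says.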

and further down in the same documentation:

`int PySequence_Contains(PyObject *o, PyObject *value)`

Determine if o contains value. If an item in o is equal to value, return `1`, otherwise return `0`. On error, return `-1`. This is equivalent to the Python expression `value in o`.

Since `''` is a real (empty) string rather than `null`, the sequence `'lolsome'` can be thought of as containing it.
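One way to see that this result comes from the string type's own containment check, and not from comparing items one at a time, is to run the same tests against a list of the same characters. A small sketch that behaves the same on Python 2 and 3:

```python
haystack = "lolsome"

# str supplies its own containment check with substring semantics.
print("" in haystack)      # True
print("lol" in haystack)   # True

# A list of the same characters is compared item by item instead,
# and no single item equals '' or 'lol'.
chars = list(haystack)
print("" in chars)         # False
print("lol" in chars)      # False
print("l" in chars)        # True
```

So the empty string is "contained" only in the substring sense that the string documentation spells out.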