When playing around with the Python interpreter, I stumbled upon this conflicting case regarding the is operator:
>>> def func():
...     a = 1000
...     b = 1000
...     return a is b
...
>>> a = 1000
>>> b = 1000
>>> a is b, func()
(False, True)
As the reference manual states:
A block is a piece of Python program text that is executed as a unit. The following are blocks: a module, a function body, and a class definition. Each command typed interactively is a block.
This is why, in the case of a function, you have a single code block which contains a single object for the numeric literal 1000, so id(a) == id(b) will yield True.
In the second case, you have two distinct code objects, each with its own object for the literal 1000, so id(a) != id(b).
Take note that this behavior doesn't manifest with int literals only; you'll get similar results with, for example, float literals.
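To make the float case concrete, here is a minimal sketch, assuming CPython. The function names same_float_literal and float_constants are made up for this illustration.

```python
def same_float_literal():
    # Two occurrences of the same float literal inside one function body.
    a = 1000.5
    b = 1000.5
    return a is b


def float_constants(func):
    # Collect the float constants stored on the function's code object.
    return [c for c in func.__code__.co_consts if isinstance(c, float)]


# On CPython both names are bound to the single stored constant.
print(same_float_literal())
print(float_constants(same_float_literal))
```

The second print shows that the code object holds the float only once, which is why the identity check inside the function succeeds on CPython.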
Of course, comparing objects (except for explicit is None tests) should always be done with the equality operator == and not is.
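A small sketch of that advice, with a hypothetical helper named describe: identity is used only for the None check, and value comparison uses ==, so the result does not depend on how the int object was created.

```python
def describe(value):
    # An identity test is appropriate for the None singleton.
    if value is None:
        return "missing"
    # Value comparison uses ==, regardless of which int object we were given.
    if value == 1000:
        return "one thousand"
    return "something else"


x = int("1000")  # built at run time, not taken from a literal
print(describe(x))
print(describe(None))
```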
Everything stated here applies to the most popular implementation of Python, CPython. Other implementations might differ so no assumptions should be made when using them.
For the function func:
Along with all their other attributes, function objects also have a __code__ attribute that lets you peek into the compiled bytecode for that function. Using dis.code_info we can get a nicely formatted view of all the attributes stored in a code object for a given function:
>>> import dis
>>> print(dis.code_info(func))
Name:              func
Filename:          <stdin>
Argument count:    0
Kw-only arguments: 0
Number of locals:  2
Stack size:        2
Flags:             OPTIMIZED, NEWLOCALS, NOFREE
Constants:
   0: None
   1: 1000
Variable names:
   0: a
   1: b
We're only interested in the Constants entry for function func. In it, we can see that we have two values, None (always present) and 1000. We only have a single int instance that represents the constant 1000. This is the value that a and b are going to be assigned when the function is invoked.
Accessing this value is easy via func.__code__.co_consts[1] and so, another way to view our a is b evaluation in the function would be like so:
>>> id(func.__code__.co_consts[1]) == id(func.__code__.co_consts[1])
Which, of course, will evaluate to True because we're referring to the same object.
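The same check can be written as a small script. This is a sketch assuming CPython; it looks the constant up by value rather than by a fixed index, since the layout of co_consts can differ between versions.

```python
def func():
    a = 1000
    b = 1000
    return a is b


consts = func.__code__.co_consts
# Find every int constant equal to 1000 instead of hard-coding an index.
thousands = [c for c in consts if type(c) is int and c == 1000]

print(len(thousands))  # how many times 1000 is stored in the code object
print(func())          # identity check performed inside the function
```

If only one entry is found, both assignments in func necessarily load the same object.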
For each interactive command:
As noted previously, each interactive command is interpreted as a single code block: parsed, compiled and evaluated independently.
We can get the code objects for each command via the compile built-in:
>>> com1 = compile("a=1000", filename="", mode="exec")
>>> com2 = compile("b=1000", filename="", mode="exec")
For each assignment statement, we get a similar-looking code object, like the following:
>>> print(dis.code_info(com1))
Name:              <module>
Filename:
Argument count:    0
Kw-only arguments: 0
Number of locals:  0
Stack size:        1
Flags:             NOFREE
Constants:
   0: 1000
   1: None
Names:
   0: a
The same command for com2 looks the same but has a fundamental difference: the code objects com1 and com2 each have a different int instance representing the literal 1000. This is why, in this case, when we do a is b via the co_consts attribute, we actually get:
>>> id(com1.co_consts[0]) == id(com2.co_consts[0])
False
Which agrees with what we actually got.
Different code objects, different contents.
Note: I was somewhat curious as to how exactly this happens in the source code and after digging through it I believe I finally found it.
/* snippet for brevity */
u->u_lineno = 0;
u->u_col_offset = 0;
u->u_lineno_set = 0;
u->u_consts = PyDict_New();
/* snippet for brevity */
Because the constants are collected in a dictionary during compilation, duplicate items (like 1000, which appears twice in the function func) will only get stored once.
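The idea of a dictionary-backed constant pool can be sketched in Python. This is a toy model, not CPython's actual C implementation: the class ConstTable and its add method are hypothetical, and keying on the type as well as the value is a choice made in this sketch so that 1000 and 1000.0 stay separate.

```python
class ConstTable:
    """Toy constant pool: each distinct constant gets exactly one slot."""

    def __init__(self):
        self._index = {}  # (type, value) -> slot number
        self.consts = []  # slot number -> constant object

    def add(self, value):
        # Key on the type too, so equal values of different types stay apart.
        key = (type(value), value)
        if key not in self._index:
            self._index[key] = len(self.consts)
            self.consts.append(value)
        return self._index[key]


table = ConstTable()
first = table.add(1000)    # first occurrence of the literal
second = table.add(1000)   # second occurrence reuses the same slot
other = table.add(1000.0)  # a float gets its own slot

print(first, second, other)
print(table.consts)
```

Every later load of slot 0 hands out the same stored object, which mirrors why a and b inside func end up identical.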
See @Raymond Hettinger's answer below for a bit more on this.
Chained statements will evaluate to an identity check of True:
It should be more clear now why exactly the following evaluates to True:
>>> a = 1000; b = 1000;
>>> a is b
True
In this case, by chaining the two assignment commands together, we tell the interpreter to compile them together. As in the case of the function object, only one object for the literal 1000 will be created, resulting in a True value when evaluated.
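The difference between chained and separate commands can be reproduced outside the interactive prompt with compile and exec. This is a sketch assuming CPython; identity_after is a hypothetical helper and "<sketch>" is an arbitrary filename label.

```python
def identity_after(*sources):
    """Compile each source string on its own, run all of them in one shared
    namespace, and report whether a and b end up bound to the same object."""
    namespace = {}
    for source in sources:
        code = compile(source, filename="<sketch>", mode="exec")
        exec(code, namespace)
    return namespace["a"] is namespace["b"]


# One compilation unit: both assignments share a single constant.
chained = identity_after("a = 1000; b = 1000")
# Two compilation units: each code object carries its own constant.
separate = identity_after("a = 1000", "b = 1000")

print(chained, separate)
```

On the CPython behavior described above, the chained form reports True, while the separately compiled form is expected to report False.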
Execution on a module level yields True again:
As previously mentioned, the reference manual states that:
... The following are blocks: a module ...
So the same premise applies: we will have a single code object (for the module) and so, as a result, single values stored for each different literal.
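A quick way to try the module case is to write a throwaway file and run it with runpy. This is a sketch assuming CPython; the file name identity_demo.py is made up for the example.

```python
import os
import runpy
import tempfile

source = "a = 1000\nb = 1000\nsame = a is b\n"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "identity_demo.py")  # hypothetical file name
    with open(path, "w") as handle:
        handle.write(source)
    # run_path compiles the whole file as one unit and returns its globals.
    module_globals = runpy.run_path(path)

print(module_globals["same"])
```

Because the whole file is compiled as a single code object, both assignments load the same stored constant.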
The same doesn't apply for mutable objects:
Meaning that unless we explicitly initialize to the same mutable object (for example with a = b = []), the identity of the objects will never be equal, for example:
a = []; b = []
a is b  # always returns False
Again, in the documentation this is specified:
after a = 1; b = 1, a and b may or may not refer to the same object with the value one, depending on the implementation, but after c = []; d = [], c and d are guaranteed to refer to two different, unique, newly created empty lists.
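The guarantee for mutable objects holds even inside a single function body, as this short sketch shows. The function names fresh_lists and shared_list are made up for the illustration.

```python
def fresh_lists():
    # Each [] display creates a new list, even within one code block.
    c = []
    d = []
    return c is d


def shared_list():
    # Chained assignment binds both names to one and the same list.
    c = d = []
    c.append(1)
    return c is d, d


print(fresh_lists())  # False
print(shared_list())  # (True, [1])
```

Unlike the int case, this result does not depend on the implementation, since the language guarantees a new list for every list display.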