aamir23 aamir23 - 7 months ago 33
Python Question

Python sum() has a different result after importing numpy

I came across this problem by Jake VanderPlas and I am not sure if my understanding of why the result differs after importing the numpy module is entirely correct.

>> 9
>> from numpy import *
>> print(sum(range(5),-1))
>> 10

It seems like in the first scenario the sum function calculates the sum over the iterable and then subtracts the second args value from the sum.

In the second scenario, after importing numpy, the behavior of the function seems to have modified as the second arg is used to specify the axis along which the sum should be performed.

Exercise number (24)
Source - http://www.labri.fr/perso/nrougier/teaching/numpy.100/index.html


Only to add my 5 pedantic coins to @Warren Weckesser answer. Really from numpy import * does not overwrite the builtins sum function, it only shadows __builtins__.sum, because from ... import * statement binds all names defined in the imported module, except those beginning with an underscore, to your current global namespace. And according to Python's name resolution rule (unofficialy LEGB rule), the global namespace is looked up before __builtins__ namespace. So if Python finds desired name, in your case sum, it returns you the binded object and does not look further.

EDIT: To show you what is going on:

 In[1]: print(sum, ' from ', sum.__module__)    # here you see the standard `sum` function
Out[1]: <built-in function sum>  from  builtins

 In[2]: from numpy import *                     # from here it is shadowed
        print(sum, ' from ', sum.__module__)
Out[2]: <function sum at 0x00000229B30E2730>  from  numpy.core.fromnumeric

 In[3]: del sum                                 # here you restore things back
        print(sum, ' from ', sum.__module__)
Out[3]: <built-in function sum>  from  builtins

First note: del does not delete objects, it is a task of garbage collector, it only "dereference" the name-bindings and delete names from current namespace.

Second note: the signature of built-in sum function is sum(iterable[, start]):

Sums start and the items of an iterable from left to right and returns the total. start defaults to 0. The iterable‘s items are normally numbers, and the start value is not allowed to be a string.

I your case print(sum(range(5),-1) for built-in sum summation starts with -1. So technically, your phrase the sum over the iterable and then subtracts the second args value from the sum isn't correct. For numbers it's really does not matter to start with or add/subtract later. But for lists it does (silly example only to show the idea):

 In[1]: sum([[1], [2], [3]], [4])
Out[1]: [4, 1, 2, 3]               # not [1, 2, 3, 4]

Hope this will clarify your thoughts :)