omar omar - 5 years ago 1274
Python Question

How to get the cumulative distribution function with NumPy?

I want to create a CDF with NumPy. My code is the following:

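import numpy as np

# data, width and height are assumed to be defined elsewhere
# (a 2-D array of integer values below 4096 and its dimensions)
histo = np.zeros(4096, dtype=np.int32)
for x in range(0, width):
    for y in range(0, height):
        histo[data[x][y]] += 1

# Running total of the histogram counts
q = 0
cdf = list()
for i in histo:
    q = q + i
    cdf.append(q)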


I am looping over the array element by element, but the program takes a long time to run. Is there a built-in function that does this?

Answer Source

I'm not really sure what your code is doing, but if you have hist and bin_edges arrays returned by numpy.histogram you can use numpy.cumsum to generate a cumulative sum of the histogram contents.

>>> import numpy as np
>>> hist, bin_edges = np.histogram(np.random.randint(0,10,100), density=True)
>>> bin_edges
array([ 0. ,  0.9,  1.8,  2.7,  3.6,  4.5,  5.4,  6.3,  7.2,  8.1,  9. ])
>>> hist
array([ 0.14444444,  0.11111111,  0.11111111,  0.1       ,  0.1       ,
        0.14444444,  0.14444444,  0.08888889,  0.03333333,  0.13333333])
>>> np.cumsum(hist)
array([ 0.14444444,  0.25555556,  0.36666667,  0.46666667,  0.56666667,
        0.71111111,  0.85555556,  0.94444444,  0.97777778,  1.11111111])
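
Note that the cumulative sum in the session above ends at 1.11111111 rather than 1. A normalized histogram holds densities, so each value has to be multiplied by its bin width (0.9 here) before summing. For the integer data in the question, a minimal sketch could look like the following. It assumes the loops are meant to count how often each value in [0, 4096) occurs, and `data` is a made-up random array standing in for the asker's image; neither is confirmed by the question.

```python
import numpy as np

# Hypothetical stand-in for the asker's data: integer values in [0, 4096).
rng = np.random.default_rng(0)
data = rng.integers(0, 4096, size=(64, 48))

# Count occurrences of each integer value; replaces the nested x/y loops.
# minlength keeps the result at 4096 entries even if high values are absent.
histo = np.bincount(data.ravel(), minlength=4096)

# Running total of the counts; replaces the manual `q = q + i` loop.
cdf = np.cumsum(histo)

# Normalized CDF: divide by the total number of samples so it ends at 1.
cdf_normalized = cdf / cdf[-1]

# With a density-normalized histogram, weight each bin by its width
# before summing so the cumulative curve also ends at 1.
hist, bin_edges = np.histogram(data, bins=10, density=True)
cdf_from_density = np.cumsum(hist * np.diff(bin_edges))
```

Both `np.bincount` and `np.cumsum` run in compiled code, so this should be much faster than the Python loops for large arrays. `np.bincount` only accepts non-negative integers; for floating-point data `np.histogram` is the usual choice.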