Elgin Cahangirov - 1 month ago
Python Question

Speeding up sliding-window average calculations

I have some data (stock data) and need to make some calculations on it. I did this with NumPy arrays, which are much faster than Python's built-in functions, but the execution time of my code is still higher than expected. The code is below, and I timed it with IPython's %timeit magic. Total execution time is 5.44 ms; the second for loop accounts for most of it (3.88 ms), and the cause is the np.mean call inside that loop. So alternatives to np.mean, and any other suggestions for speeding up execution, would be helpful.

Code

data = my_class.Project.all_data["AAP_data"]
data = np.array(data[["High", "Low", "Close"]])

true_range = np.empty((data.shape[0]-1, 1))
for i in range(1, true_range.shape[0]+1):
    true_range[i-1] = max((data[i, 0] - data[i, 1]),
                          (abs(data[i, 0] - data[i-1, 2])),
                          (abs(data[i, 1] - data[i-1, 2])))

average_true_range = np.empty((true_range.shape[0]-13, 1))
for i in range(13, average_true_range.shape[0]+13):
    lastn_tr = true_range[(i-13):(i+1)]
    average_true_range[i-13] = np.mean(lastn_tr)

Answer

That is basically a sliding-window average calculation. The averaging can be thought of as summing over sliding windows and then dividing by the window length, so we can use 1D convolution with np.convolve for a vectorized solution that gets rid of that entire loopy process. One caveat: np.convolve expects 1D inputs, while the question builds true_range as an (n, 1) column, so flatten it first with .ravel(). That gives us average_true_range, like so -

np.convolve(true_range.ravel(), np.ones(14), 'valid') / 14.0
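
A quick sanity check against the loopy version (using a random stand-in array, since the real AAP_data isn't available here) confirms the two agree:

import numpy as np

true_range = np.random.rand(100, 1)   # column vector, like in the question
# Loopy 14-element means, exactly as in the original second loop
loopy = np.array([true_range[i-13:i+1].mean()
                  for i in range(13, true_range.shape[0])])
# Vectorized equivalent
vect = np.convolve(true_range.ravel(), np.ones(14), 'valid') / 14.0
print(np.allclose(loopy, vect))       # True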

For a further performance boost, recall that CPUs handle multiplication more efficiently than division, so we can precompute the reciprocal once and multiply instead, for an improved version -

r = 1.0/14
out = np.convolve(true_range.ravel(), np.ones(14), 'valid') * r

Runtime test -

In [53]: def original_app(true_range):
    ...:     average_true_range = np.zeros((true_range.shape[0]-13, 1))
    ...:     for i in range(13, average_true_range.shape[0]+13):
    ...:         lastn_tr = true_range[(i-13):(i+1)]
    ...:         average_true_range[i-13] = np.mean(lastn_tr)
    ...:     return average_true_range
    ...: 
    ...: def vectorized_app(true_range):
    ...:     return np.convolve(true_range,np.ones((14),dtype=int),'valid')/14.0
    ...: 
    ...: def vectorized_app2(true_range):
    ...:     r = 1.0/14
    ...:     return np.convolve(true_range,np.ones((14),dtype=int),'valid')*r
    ...: 

In [54]: true_range = np.random.rand(10000) # Input array

In [55]: %timeit original_app(true_range)
1 loops, best of 3: 180 ms per loop

In [56]: %timeit vectorized_app(true_range)
1000 loops, best of 3: 446 µs per loop

In [57]: %timeit vectorized_app2(true_range)
1000 loops, best of 3: 401 µs per loop

Massive speedups there!
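
One more option worth knowing: if the window were much larger than 14, a cumulative-sum formulation computes the same sliding means in O(n) time regardless of window size, whereas the cost of convolution grows with the window. A minimal sketch (the helper name cumsum_app is just for illustration):

def cumsum_app(true_range, w=14):
    # Prepend a zero so csum[k] is the sum of the first k elements;
    # each window sum is then one subtraction: csum[i+w] - csum[i].
    csum = np.cumsum(np.insert(true_range, 0, 0.0))
    return (csum[w:] - csum[:-w]) / float(w)

On the random test array above this matches vectorized_app2 to within floating-point rounding, though very long cumulative sums can accumulate slightly more error than direct summation.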


Later on, the bottleneck might shift to the first part, computing true_range. To vectorize that part as well, here's an approach using slicing -

col0 = data[1:,0] - data[1:,1]              # High - Low
col1 = np.abs(data[1:,0] - data[:-1,2])     # |High - previous Close|
col2 = np.abs(data[1:,1] - data[:-1,2])     # |Low - previous Close|
true_range = np.maximum(np.maximum(col0,col1),col2)
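
Putting the two vectorized pieces together, the whole computation collapses to a few lines. A sketch, assuming data is the (n, 3) High/Low/Close array from the question; note that the slicing version already yields a 1D true_range, so it feeds straight into np.convolve with no .ravel() needed:

import numpy as np

high, low = data[1:, 0], data[1:, 1]
prev_close = data[:-1, 2]
true_range = np.maximum(np.maximum(high - low,
                                   np.abs(high - prev_close)),
                        np.abs(low - prev_close))
average_true_range = np.convolve(true_range, np.ones(14), 'valid') / 14.0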