Merlin - 1 year ago 86
Python Question

# Create Numpy array without enumerating array

Starting with this:

``````
import numpy as np

x = range(30,60,2)[::-1]
x = np.asarray(x); x

array([58, 56, 54, 52, 50, 48, 46, 44, 42, 40, 38, 36, 34, 32, 30])
``````

Create an array like this (notice that the first item of each row repeats). If I can get the array faster without the repeated first item, I can `np.hstack` the first item on afterwards.

``````
[[58 58 56 54 52]
 [56 56 54 52 50]
 [54 54 52 50 48]
 [52 52 50 48 46]
 [50 50 48 46 44]
 [48 48 46 44 42]
 [46 46 44 42 40]
 [44 44 42 40 38]
 [42 42 40 38 36]
 [40 40 38 36 34]
 [38 38 36 34 32]
 [36 36 34 32 30]
 [34 34 32 30 None]
 [32 32 30 None None]
 [30 30 None None None]]
``````

The code below works, but I want it to be faster, without the `for` loop and `enumerate`.

``````
arr = np.empty((0,5), int)

for i,e in enumerate(x):
    # Repeated first item, then the next 4 items, padded with None and cut to 5 columns
    arr2 = np.hstack((x[i], x[i:i+4], np.asarray([None]*5)))[:5]
    arr  = np.vstack((arr,arr2))
``````

Approach #1

Here's a vectorized approach using `NumPy broadcasting` -

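``````
N = 4 # width factor
# Pad with N-1 None values so the last windows can run past the end of x
x_ext = np.concatenate((x,[None]*(N-1)))
# Column offsets + row offsets broadcast to a 2D grid of indices into x_ext
arr2D = x_ext[np.arange(N) + np.arange(x_ext.size-N+1)[:,None]]
# Prepend x itself as the repeated first column
out = np.column_stack((x,arr2D))
``````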
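To make the broadcasting step easier to follow, here is a minimal sketch on a shortened input (the five-element `x` and the names `idx` and `windows` are chosen for illustration only and are not part of the original answer). It prints the 2D index grid that the sum of the two `np.arange` calls produces, then uses it to pull out the sliding windows.

```python
import numpy as np

N = 4  # window width, as in the approach above
x = np.array([58, 56, 54, 52, 50])  # shortened input, for illustration only

# Pad with N-1 None values so the trailing windows can run past the end of x
x_ext = np.concatenate((x, [None] * (N - 1)))

# Row vector of column offsets + column vector of row starts -> 2D index grid
idx = np.arange(N) + np.arange(x_ext.size - N + 1)[:, None]
print(idx)

windows = x_ext[idx]                 # each row is one length-N window of x_ext
out = np.column_stack((x, windows))  # prepend the repeated first item
print(out)
```

Row `i` of `idx` holds the indices `i, i+1, ..., i+N-1`, so indexing `x_ext` with it gathers every window in a single fancy-indexing call instead of one slice per loop iteration.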

Approach #2

Here's another one using `hankel` -

``````
from scipy.linalg import hankel

N = 4 # width factor
x_ext = np.concatenate((x,[None]*(N-1)))
# hankel(c, r): first column c, last row r; entry [i, j] is x_ext[i + j]
out = np.column_stack((x,hankel(x_ext[:N], x_ext[N-1:]).T))
``````
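As a rough check of why `hankel` fits here, the sketch below compares its transposed output with the windows built by index broadcasting. It assumes a shortened float input padded with NaN instead of `None` (as the timing script further down does), and the names `idx`, `windows` and `matches` are illustrative only.

```python
import numpy as np
from scipy.linalg import hankel

N = 4  # window width
x = np.array([58, 56, 54, 52, 50], dtype=float)  # shortened input, illustration only

# Assumption: pad with NaN (float) rather than None
x_ext = np.concatenate((x, [np.nan] * (N - 1)))

# First column is x_ext[:N], last row is x_ext[N-1:], so H[i, j] == x_ext[i + j]
H = hankel(x_ext[:N], x_ext[N - 1:])  # shape (N, len(x))

# The same windows built with index broadcasting, for comparison
idx = np.arange(N) + np.arange(x.size)[:, None]
windows = x_ext[idx]

matches = np.allclose(H.T, windows, equal_nan=True)
out = np.column_stack((x, H.T))  # prepend the repeated first item
print(H.shape, matches)
print(out)
```

Because a Hankel matrix is constant along its anti-diagonals, each row of its transpose is the previous row shifted by one element, which is exactly the sliding-window layout asked for.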

Runtime test

Here's a modified version of `@Aaron's benchmarking script`. For a fair comparison, the approach from this post is given the same input format that the script uses for his solution, and the test covers just two approaches: his loop and the broadcasting approach (Approach #1) -

``````
import numpy as np
from time import time

upper_limit = 58 # We will edit this to vary the dataset sizes

print("Timings are : ")
t = time()
for _ in range(1000):  #1000 iterations of @Aaron's soln.
    width = 3
    x = np.array(list(range(upper_limit,28,-2)) + [float('nan')]*width)
    arr = np.empty([len(x)-width, width+2])
    arr[:,0] = x[:len(x)-width]
    for i in range(len(x)-width):
        arr[i,1:] = x[i:i+width+1]
print(time()-t)

t = time()
for _ in range(1000):  #1000 iterations of the broadcasting approach
    N = 4 # width factor
    x_ext = np.array(list(range(upper_limit,28,-2)) + [float('nan')]*(N-1))
    arr2D = x_ext[np.arange(N) + np.arange(x_ext.size-N+1)[:,None]]
    out = np.column_stack((x_ext[:len(x_ext)-N+1],arr2D))
print(time()-t)
``````
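A timing comparison only means something if both variants build the same array, so a quick sanity check along these lines may be worthwhile. This is a sketch in Python 3 syntax: the function names `loop_version` and `broadcast_version` are hypothetical wrappers around the two timed bodies, not part of the original script.

```python
import numpy as np

def loop_version(upper_limit, width=3):
    # Hypothetical wrapper around the loop-based body from the timing script
    x = np.array(list(range(upper_limit, 28, -2)) + [float('nan')] * width)
    arr = np.empty([len(x) - width, width + 2])
    arr[:, 0] = x[:len(x) - width]
    for i in range(len(x) - width):
        arr[i, 1:] = x[i:i + width + 1]
    return arr

def broadcast_version(upper_limit, N=4):
    # Hypothetical wrapper around the broadcasting body from the timing script
    x_ext = np.array(list(range(upper_limit, 28, -2)) + [float('nan')] * (N - 1))
    arr2D = x_ext[np.arange(N) + np.arange(x_ext.size - N + 1)[:, None]]
    return np.column_stack((x_ext[:len(x_ext) - N + 1], arr2D))

a = loop_version(58)
b = broadcast_version(58)

# NaN != NaN, so compare with equal_nan=True
same = a.shape == b.shape and np.allclose(a, b, equal_nan=True)
print(a.shape, same)
```

Both functions use NaN padding on float arrays, matching the timing script, rather than the `None` padding on object arrays used in the approaches above.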

Case #1 (upper_limit = 58 ) :

``````
Timings are :
0.0316879749298
0.0322730541229
``````

Case #2 (upper_limit = 1058 ) :

``````
Timings are :
0.680443048477
0.124517917633
``````

Case #3 (upper_limit = 5058 ) :

``````
Timings are :
3.28129291534
0.47504901886
``````