Mad Wombat - 2 months ago
Python Question

Make a list of ranges in numpy

I want to make a list of integer sequences with random start points. The way I would do this with a plain Python loop is


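import numpy as np

# The shape must be passed as a tuple; dtype=int keeps the sequences as integers
x = np.zeros((1000, 10), dtype=int)  # 1000 sequences of 10 elements each
starts = np.random.randint(1, 1000, 1000)  # random start points in [1, 1000)
for i in range(len(x)):
    x[i] = np.arange(starts[i], starts[i] + 10)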


I wonder if there is a more elegant way of doing this using Numpy functionality.

Answer

You can use broadcasting: extend starts to a 2D version, then add the 1D range array, like so -

x = starts[:,None] + np.arange(10)
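
As a quick sanity check, here is a minimal sketch comparing the loop from the question against the broadcasting one-liner. The seeded generator and the names x_loop and x_bcast are assumptions made for illustration, not part of the original post.

```python
import numpy as np

# Seeded generator so the check is reproducible (an assumption, not in the post)
rng = np.random.default_rng(0)
starts = rng.integers(1, 1000, size=1000)

# Loop version, as in the question
x_loop = np.zeros((1000, 10), dtype=starts.dtype)
for i in range(len(x_loop)):
    x_loop[i] = np.arange(starts[i], starts[i] + 10)

# Broadcasting version, as in the answer
x_bcast = starts[:, None] + np.arange(10)

print(x_bcast.shape)                    # (1000, 10)
print(np.array_equal(x_loop, x_bcast))  # True
```

Both versions should produce the same 1000 x 10 array; the broadcasting version just avoids the Python-level loop.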

Explanation

Let's take a small example for starts to see what that broadcasting does in this case.

In [382]: starts
Out[382]: array([3, 1, 3, 2])

In [383]: starts.shape
Out[383]: (4,)

In [384]: starts[:,None]
Out[384]: 
array([[3],
       [1],
       [3],
       [2]])

In [385]: starts[:,None].shape
Out[385]: (4, 1)

In [386]: np.arange(10).shape
Out[386]: (10,)

Putting those shapes together, a schematic diagram would look something like this -

starts         :  4
np.arange(10)  :  10

After extending starts :

starts[:,None] :  4  x  1
np.arange(10)  :       10

Thus, when we add starts[:,None] and np.arange(10), the elements of starts[:,None] are broadcast along its second axis 10 times, matching the length of the other array along that axis. np.arange(10) is treated as 2D with a singleton first dimension, and its elements are broadcast along that dimension 4 times, matching the length of starts[:,None] along that axis. Please note that no explicit replication takes place: under the hood, the elements are broadcast and added on the fly.
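
One way to see that nothing is actually replicated is to inspect the broadcast views directly. The sketch below uses np.broadcast and np.broadcast_to for that purpose; the variable names a and b are illustrative assumptions, and the exact non-zero stride values depend on the platform's integer size.

```python
import numpy as np

starts = np.array([3, 1, 3, 2])

# Shape that the two operands broadcast to
print(np.broadcast(starts[:, None], np.arange(10)).shape)  # (4, 10)

# Read-only broadcast views of each operand
a = np.broadcast_to(starts[:, None], (4, 10))
b = np.broadcast_to(np.arange(10), (4, 10))

# A stride of 0 means the same memory is reused along that axis
print(a.strides[1])  # 0, starts is "repeated" along the second axis
print(b.strides[0])  # 0, the range is "repeated" along the first axis

# The view still points at the original data, so no copy was made
print(np.shares_memory(a, starts))  # True
```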

Functionally, the result is the same as if the two arrays had been replicated, like so -

In [391]: np.repeat(starts[:,None],10,axis=1)
Out[391]: 
array([[3, 3, 3, 3, 3, 3, 3, 3, 3, 3],
       [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
       [3, 3, 3, 3, 3, 3, 3, 3, 3, 3],
       [2, 2, 2, 2, 2, 2, 2, 2, 2, 2]])

In [392]: np.repeat(np.arange(10)[None],4,axis=0)
Out[392]: 
array([[0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
       [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
       [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
       [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]])

These broadcast elements are then added to give us the desired output x.
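
To tie the pieces together, here is a small sketch that adds the two explicitly replicated arrays and compares the sum with the broadcasting result for the same 4-element starts. The names rep_starts, rep_range and x_explicit are assumptions introduced for illustration.

```python
import numpy as np

starts = np.array([3, 1, 3, 2])

# Explicit replications, as shown above
rep_starts = np.repeat(starts[:, None], 10, axis=1)    # shape (4, 10)
rep_range = np.repeat(np.arange(10)[None], 4, axis=0)  # shape (4, 10)
x_explicit = rep_starts + rep_range

# Broadcasting does the same thing without building the replicas
x = starts[:, None] + np.arange(10)

print(x)                              # each row counts up from its start value
print(np.array_equal(x, x_explicit))  # True
```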
