Mert Ovn - 1 year ago 75

Python Question

I need to generate a 3×n matrix with random columns, ensuring that no column contains the same number more than once. I am currently using the code below:

```
n=10
set = np.arange(0, 10)
matrix = np.random.choice(set, size=3, replace=False)[:, None]
for i in range(n):
    column = np.random.choice(set, size=3, replace=False)[:, None]
    matrix = np.concatenate((matrix, column),axis=1)
print matrix
```

which gives the output I expected:

```
[[2 1 7 2 1 9 7 4 5 2 7]
 [4 6 3 5 9 8 1 3 8 4 0]
 [3 5 0 0 4 5 4 0 2 5 3]]
```

However, the code does not seem to run fast enough. I am aware that implementing the for loop in Cython might help, but I would like to know whether there is a more performant way to write this code purely in Python.

Answer Source

As was already mentioned in the comments, repeatedly concatenating to a `numpy` array is a bad idea, because each call allocates a new array and copies the existing data into it. Since you already know the final size of your result array, you can simply allocate it once at the beginning and then iterate over the columns:

```
matrix = np.empty((3, n), dtype=int)  # allocate the full result once
for i in range(n):
    # fill column i with 3 distinct values drawn from 0..9
    matrix[:, i] = np.random.choice(10, size=3, replace=False)
```

At least on my machine, this is already about 6 times faster than your version.
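
If the remaining Python loop is still too slow, one option is to drop it entirely: draw a random key for every (value, column) pair, argsort each column of keys to get a random permutation of 0..9, and keep the first three rows. The following is only a sketch of that idea and was not part of the timing above. The function name `random_distinct_columns` and its parameters `k`, `high` and `seed` are made up for illustration, and it assumes NumPy 1.17 or later for `np.random.default_rng`.

```python
import numpy as np


def random_distinct_columns(n, k=3, high=10, seed=None):
    """Return a (k, n) integer array whose columns each hold
    k distinct values from range(high).

    Hypothetical helper for illustration; requires k <= high.
    """
    rng = np.random.default_rng(seed)
    # One random key per candidate value (rows) and output column (columns).
    keys = rng.random((high, n))
    # Sorting the keys of a column yields a random permutation of
    # 0..high-1; the first k rows are therefore k distinct values.
    return np.argsort(keys, axis=0)[:k]


matrix = random_distinct_columns(10)
print(matrix.shape)  # (3, 10)
```

Note that this sorts `high` keys for every column, so it is best suited to a small value range such as the 0..9 used here. I have not benchmarked it against the loop, so measure it on your own data before relying on it.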