off99555 - 1 year ago 32

Python Question

I have the following code in Python (for NumPy arrays or `scipy.sparse` matrices), and it works:

`X[a,:][:,b]`

But it doesn't look elegant. `a` and `b` are 1-D boolean masks.

`a` has the same length as `X.shape[0]`, and `b` has the same length as `X.shape[1]`.

I tried

`X[a,b]`

What I am trying to accomplish is to select particular rows and columns at the same time. For example, select rows 0, 7, 8, then from that result select columns 2, 3, 4.

How would you make this shorter and more elegant?

Answer

You could use `np.ix_` for such broadcasted indexing, like so:

```
X[np.ix_(a,b)]
```

This isn't any shorter than the original code, but it should be faster, because it avoids the intermediate array. The original code first creates `X[a,:]` with one indexing step and then indexes that result again with `[:,b]` to give the final output.
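To see why a single indexing step is enough, here is a minimal sketch of what `np.ix_` builds from two boolean masks. The matrix and mask values are made up for illustration. `np.ix_` converts each mask to integer positions and reshapes them so that they broadcast against each other into a full row-by-column grid.

```python
import numpy as np

# Hypothetical 6x5 matrix and masks, chosen only for illustration
X = np.arange(30).reshape(6, 5)
a = np.array([True, False, True, False, False, True])  # rows 0, 2, 5
b = np.array([False, True, True, False, True])         # columns 1, 2, 4

# np.ix_ returns one index array per axis, shaped to broadcast together
rows, cols = np.ix_(a, b)
print(rows.shape, cols.shape)  # (3, 1) (1, 3)

# One indexing operation, no intermediate X[a, :] array
one_step = X[rows, cols]

# The original two-step version gives the same result
two_step = X[a, :][:, b]
print(np.array_equal(one_step, two_step))  # True
```

Because `rows` has shape `(3, 1)` and `cols` has shape `(1, 3)`, NumPy broadcasts them to a `(3, 3)` grid of index pairs and gathers the result in one pass.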

Also, this method works when `a` and `b` are either `int` or `boolean` arrays.
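As a rough illustration of that point, the sketch below uses the rows 0, 7, 8 and columns 2, 3, 4 from the question on a made-up 9x10 matrix. It builds equivalent boolean masks and checks that both forms select the same block. It also shows, as a contrast, that passing the integer arrays directly without `np.ix_` pairs them element by element instead of forming a grid.

```python
import numpy as np

# Hypothetical 9x10 matrix, chosen only for illustration
X = np.arange(90).reshape(9, 10)

# Integer index arrays: rows 0, 7, 8 and columns 2, 3, 4
row_idx = np.array([0, 7, 8])
col_idx = np.array([2, 3, 4])

# Equivalent boolean masks, one entry per row / per column
row_mask = np.zeros(X.shape[0], dtype=bool)
row_mask[row_idx] = True
col_mask = np.zeros(X.shape[1], dtype=bool)
col_mask[col_idx] = True

by_index = X[np.ix_(row_idx, col_idx)]
by_mask = X[np.ix_(row_mask, col_mask)]
print(np.array_equal(by_index, by_mask))  # True

# Without np.ix_, integer arrays are paired up element by element:
# this picks X[0, 2], X[7, 3], X[8, 4] rather than a 3x3 block
paired = X[row_idx, col_idx]
print(paired)  # [ 2 73 84]
```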

**Sample run**

```
In [141]: X = np.random.randint(0,99,(6,5))
In [142]: m,n = X.shape
In [143]: a = np.in1d(np.arange(m),np.random.randint(0,m,(m)))
In [144]: b = np.in1d(np.arange(n),np.random.randint(0,n,(n)))
In [145]: X[a,:][:,b]
Out[145]:
array([[17, 81, 64],
       [87, 16, 54],
       [98, 22, 11],
       [26, 54, 64]])
In [146]: X[np.ix_(a,b)]
Out[146]:
array([[17, 81, 64],
       [87, 16, 54],
       [98, 22, 11],
       [26, 54, 64]])
```

**Runtime test**

```
In [147]: X = np.random.randint(0,99,(600,500))
In [148]: m,n = X.shape
In [149]: a = np.in1d(np.arange(m),np.random.randint(0,m,(m)))
In [150]: b = np.in1d(np.arange(n),np.random.randint(0,n,(n)))
In [151]: %timeit X[a,:][:,b]
1000 loops, best of 3: 1.74 ms per loop
In [152]: %timeit X[np.ix_(a,b)]
1000 loops, best of 3: 1.24 ms per loop
```

Source (Stackoverflow)