Nick - 9 months ago 47

Python Question

I am trying to index the testdata with just the labels that equal 2 and 3. However, when I run this code, it turns my array from 2D (100 x 100) into 3D (100 x 1 x 100).

Can anyone explain why it is doing this? The last line in the code is the culprit, but I am not sure why it is happening.

```
labels = testdata[:,0]
num2 = numpy.nonzero(labels == 2)
num2 = numpy.transpose(num2)
num3 = numpy.nonzero(labels == 3)
num3 = numpy.transpose(num3)
num = numpy.vstack([num2,num3])
testdata = testdata[num,:]
```

Answer

When something is puzzling, print intermediate values. Better yet, run a test case in an interactive shell so you can check each value and understand what is going on. Keep track of the shapes.

Looks like `labels` is a 1d array of numbers like:

```
In [212]: labels=np.array([0,1,2,2,3,2,0,3,2])
```

Indexes where `labels` is 2 or 3:

```
In [213]: num2=np.nonzero(labels==2)
In [214]: num2
Out[214]: (array([2, 3, 5, 8], dtype=int32),)
In [215]: num3=np.nonzero(labels==3)
```

Here's a key step: what is the purpose of `transpose`? Note that `num2` is a tuple with one 1d array.

```
In [216]: num2=np.transpose(num2)
In [217]: num3=np.transpose(num3)
In [218]: num2
Out[218]:
array([[2],
       [3],
       [5],
       [8]], dtype=int32)
```

After the transpose, `num2` is a column array with shape (4,1).
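For context on what `transpose` is normally for here: `np.nonzero` returns one index array per dimension, so for a 2d input `np.transpose(np.nonzero(a))` gives one (row, column) pair per match, which is also what `np.argwhere` returns. With a 1d input there is only one index array, so the same idiom yields a single-column 2d array. A small sketch (the array `a` is made up for illustration and is not part of the question):

```python
import numpy as np

# Hypothetical 2d example, not from the question
a = np.array([[0, 2, 0],
              [3, 0, 2],
              [2, 0, 0]])

idx = np.nonzero(a == 2)       # tuple of 2 arrays: (rows, cols)
pairs = np.transpose(idx)      # one (row, col) pair per match, shape (3, 2)
print(pairs.tolist())          # [[0, 1], [1, 2], [2, 0]]
print(np.array_equal(pairs, np.argwhere(a == 2)))   # True

# 1d input: only one index array, so the transpose has a single column
labels = np.array([0, 1, 2, 2, 3, 2, 0, 3, 2])
col = np.transpose(np.nonzero(labels == 2))
print(col.shape)                             # (4, 1)
print(np.nonzero(labels == 2)[0].shape)      # (4,)
```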

Joining them vertically produces a (6,1) array:

```
In [220]: num=np.vstack([num2,num3])
In [221]: num
Out[221]:
array([[2],
       [3],
       [5],
       [8],
       [4],
       [7]], dtype=int32)
In [222]: num.shape
Out[222]: (6, 1)
In [223]: labels[num]
Out[223]:
array([[2],
       [2],
       [2],
       [2],
       [3],
       [3]])
In [224]: labels[num].shape
Out[224]: (6, 1)
```

Indexing the 1d array with that array produces another array of the same shape as the index. Indexing a 2d array as `x[num,:]` does the same thing, but with the array's last dimension appended: a (6,1) index into an array with 100 columns gives a (6,1,100) result.

If I index a (3,4) array with a (2,5) array in the 1st dimension, the result is a (2,5,4) array:

```
In [227]: np.ones((3,4))[np.ones((2,5),int),:].shape
Out[227]: (2, 5, 4)
```
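Applied to the question, one way out is to keep the index 1d. The sketch below is not from the original answer. It uses a small made-up `testdata` (9 rows, 4 columns, labels in column 0) in place of the asker's 100-column array, and shows two options: concatenating the 1d index arrays from `np.nonzero`, or using a boolean mask. The two differ in row order: the concatenated index lists all label-2 rows before the label-3 rows, while the mask keeps the original row order.

```python
import numpy as np

# Made-up stand-in for the asker's testdata: labels in column 0
labels = np.array([0, 1, 2, 2, 3, 2, 0, 3, 2])
testdata = np.column_stack([labels, np.arange(27).reshape(9, 3)])  # shape (9, 4)

# Reproduce the problem: a (6, 1) index adds an extra dimension
num = np.vstack([np.transpose(np.nonzero(labels == 2)),
                 np.transpose(np.nonzero(labels == 3))])
print(testdata[num, :].shape)      # (6, 1, 4)

# Option 1: keep the indexes 1d ([0] pulls the array out of the tuple)
idx = np.concatenate([np.nonzero(labels == 2)[0],
                      np.nonzero(labels == 3)[0]])
grouped = testdata[idx, :]
print(grouped.shape)               # (6, 4)
print(grouped[:, 0].tolist())      # [2, 2, 2, 2, 3, 3]

# Option 2: boolean mask, rows stay in their original order
mask = (labels == 2) | (labels == 3)
in_order = testdata[mask]
print(in_order[:, 0].tolist())     # [2, 2, 3, 2, 3, 2]
```

If `num` has already been built as a column array, flattening it first with `testdata[num.ravel(), :]` gives the same 2d result as option 1.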

Source (Stackoverflow)